
Learn Advanced Git Workflows: From Basic Commits to Monorepo Mastery

Goal: Deeply understand how Git actually works under the hood—from the object database and DAG structure to advanced workflows like trunk-based development, sophisticated rebase strategies, professional code review flows, and monorepo patterns. You’ll understand not just the commands, but why they work, what problems they solve, and how to architect Git workflows for teams of any size.


Why Advanced Git Workflows Matter

In 2005, Linus Torvalds created Git in just two weeks to manage Linux kernel development after the proprietary BitKeeper revoked its free license. His design was radical: a distributed version control system where every developer has the complete history, where branching is nearly instantaneous, and where the data model is built on cryptographic integrity.

The scale of Git’s impact:

  • Over 100 million repositories on GitHub alone
  • Linux kernel: 1.3+ million commits, 25,000+ contributors
  • Google’s monorepo: 2+ billion lines of code, 86TB of data
  • Microsoft’s Windows: 3.5 million files, largest Git repo ever migrated

Why most developers never go beyond basics:

  • Git’s porcelain (user-facing) commands hide the plumbing (internal operations)
  • Most tutorials teach “git add, commit, push” without explaining the object model
  • Workflow decisions (merge vs. rebase, trunk vs. feature branches) are made without understanding tradeoffs
  • Monorepo challenges are only discovered at scale

What understanding workflows unlocks:

  • Debug any Git situation by understanding the underlying data structure
  • Choose the right workflow for your team’s needs
  • Implement CI/CD pipelines that leverage Git’s capabilities
  • Scale repositories from solo projects to enterprise monorepos

Core Concept Analysis

The Git Object Model: Everything is Content-Addressable

Before understanding workflows, you must understand Git’s foundation: the object database.

┌─────────────────────────────────────────────────────────────────┐
│                    .git/objects/                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   BLOB (file content)        TREE (directory)                   │
│   ┌──────────────────┐       ┌──────────────────┐              │
│   │ "Hello, World!"  │       │ 100644 blob abc  │              │
│   │                  │       │        → README  │              │
│   │ SHA: abc123...   │       │ 040000 tree def  │              │
│   └──────────────────┘       │        → src/    │              │
│                              │ SHA: def456...   │              │
│                              └──────────────────┘              │
│                                                                 │
│   COMMIT                     TAG                                │
│   ┌──────────────────┐       ┌──────────────────┐              │
│   │ tree: def456     │       │ object: ghi789   │              │
│   │ parent: 000000   │       │ type: commit     │              │
│   │ author: Alice    │       │ tag: v1.0.0      │              │
│   │ message: "Init"  │       │ tagger: Bob      │              │
│   │ SHA: ghi789...   │       │ SHA: jkl012...   │              │
│   └──────────────────┘       └──────────────────┘              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key insight: Git stores SNAPSHOTS, not diffs.
Each commit points to a complete tree of the entire project.

The Commit DAG (Directed Acyclic Graph)

Git history is not a linear sequence—it’s a graph where commits point to their parents.

Simple linear history:
A ← B ← C ← D ← E  (HEAD → main)

Branching:
A ← B ← C ← D ← E       (HEAD → main)
         ↖
          F ← G ← H     (feature)

After merge:
A ← B ← C ← D ← E ← M   (HEAD → main)
         ↖         ↗
          F ← G ← H     (feature)

After rebase (feature onto main):
A ← B ← C ← D ← E       (main)
                  ↖
                   F' ← G' ← H'  (HEAD → feature)

Note: F', G', H' are NEW commits with different SHAs
      (same content, different parent = different hash)
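A minimal sketch of why this happens, using a stripped-down commit format (real commits also carry author and committer lines; the 40-character parent values are placeholders):

```python
import hashlib

def commit_sha(tree: str, parent: str, message: str) -> str:
    # Stripped-down commit object; real ones also include author/committer lines.
    body = f"tree {tree}\nparent {parent}\n\n{message}\n".encode()
    header = f"commit {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"   # same snapshot both times
on_old_base = commit_sha(tree, "e" * 40, "Add feature")
on_new_base = commit_sha(tree, "f" * 40, "Add feature")  # rebased onto a new parent
print(on_old_base != on_new_base)  # True: identical content, different parent, new SHA
```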

Trunk-Based Development vs. Feature Branch Flow

┌─────────────────────────────────────────────────────────────────┐
│              GITFLOW (Long-lived branches)                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  main     ─────○─────────────────○─────────────────○────        │
│                ↑                 ↑                 ↑            │
│  release  ─────┼───○───○────────┼───○───○────────┼─────        │
│                │   ↑   ↑        │   ↑   ↑        │             │
│  develop  ─○─○─┼─○─┼───┼─○─○─○──┼───┼───┼─○─○────┼────         │
│            ↑   │   │   │ ↑   ↑  │   │   │ ↑      │             │
│  feature/  └───┘   │   │ └───┘  │   │   │ └──────┘             │
│            ↑       │   │        │   │   │                      │
│  hotfix    └───────┴───┘        └───┴───┘                      │
│                                                                 │
│  Problems: Merge hell, long-lived branches, integration pain    │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│              TRUNK-BASED DEVELOPMENT                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  main ────○──○──○──○──○──○──○──○──○──○──○──○──○──○──○───→       │
│           ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑  ↑          │
│           │  │  │  │  │  │  │  │  │  │  │  │  │  │  │          │
│  short-   └──┘  └──┘  └──┘  └──┘  └──┘  └──┘  └──┘  └──        │
│  lived                                                          │
│  branches (< 1 day, max 2 days)                                 │
│                                                                 │
│  Key practices:                                                 │
│  • Feature flags for incomplete work                            │
│  • Small, frequent commits                                      │
│  • Continuous integration on every push                         │
│  • No long-lived branches                                       │
└─────────────────────────────────────────────────────────────────┘
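The "feature flags" practice deserves a concrete picture: unfinished work merges to main but stays dark until the flag flips. A minimal, hypothetical sketch; the flag name, store, and `checkout` function are invented for illustration, and real teams usually back flags with a config service or environment variables.

```python
# Hypothetical in-process flag store; real teams use a config service or env vars.
FLAGS = {"new-checkout-flow": False}  # merged to main, but dark in production

def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)

def checkout(cart: list) -> str:
    if is_enabled("new-checkout-flow"):
        return f"new flow for {len(cart)} items"     # incomplete work ships disabled
    return f"legacy flow for {len(cart)} items"      # current behavior stays live

print(checkout(["book", "pen"]))  # legacy flow for 2 items
```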

Merge vs. Rebase: The Fundamental Tradeoff

BEFORE (same starting point):
A ← B ← C           (main)
         ↖
          D ← E     (feature)

┌─────────────────────────────────────────────────────────────────┐
│                        MERGE                                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  A ← B ← C ←───── M     (main after merge)                     │
│           ↖     ↗                                               │
│            D ← E        (feature)                               │
│                                                                 │
│  Pros:                           Cons:                          │
│  • Preserves exact history       • "Merge commit" clutter       │
│  • Non-destructive               • Non-linear history           │
│  • Safe for shared branches      • Harder to read               │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                        REBASE                                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  A ← B ← C ← D' ← E'    (feature after rebase onto main)       │
│                                                                 │
│  Pros:                           Cons:                          │
│  • Clean, linear history         • Rewrites history (new SHAs)  │
│  • Easy to follow                • NEVER on shared branches     │
│  • Better for bisect             • Can lose merge context       │
└─────────────────────────────────────────────────────────────────┘

THE GOLDEN RULE:
"Never rebase commits that exist outside your repository"
(i.e., commits you've already pushed to shared branches)

Interactive Rebase: The Power Tool

git rebase -i HEAD~5

pick   abc1234  Add user model
squash def5678  Fix typo in user model        ← combines with previous
reword ghi9012  Add authentication           ← edit commit message
edit   jkl3456  Add password hashing         ← pause here to amend
drop   mno7890  WIP: debugging stuff         ← delete this commit

Result: Clean, logical commit history ready for code review

┌────────────────────────────────────────────────────────────────┐
│  Commands available in interactive rebase:                     │
├────────────────────────────────────────────────────────────────┤
│  p, pick   = use commit                                        │
│  r, reword = use commit, but edit the commit message           │
│  e, edit   = use commit, but stop for amending                 │
│  s, squash = use commit, but meld into previous commit         │
│  f, fixup  = like "squash", but discard this commit's log msg  │
│  x, exec   = run command (the rest of the line) using shell    │
│  d, drop   = remove commit                                     │
└────────────────────────────────────────────────────────────────┘

Code Review Flows: From PR to Merge

┌─────────────────────────────────────────────────────────────────┐
│                    PULL REQUEST LIFECYCLE                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Developer                    Reviewer                          │
│  ─────────                    ────────                          │
│      │                            │                             │
│      │ 1. Create feature branch   │                             │
│      ├─────────────────────────>  │                             │
│      │                            │                             │
│      │ 2. Push commits            │                             │
│      ├─────────────────────────>  │                             │
│      │                            │                             │
│      │ 3. Open PR                 │                             │
│      ├─────────────────────────>  │                             │
│      │                            │ 4. Review code              │
│      │  <─────────────────────────┤                             │
│      │    Request changes         │                             │
│      │                            │                             │
│      │ 5. Address feedback        │                             │
│      ├─────────────────────────>  │                             │
│      │   (force push if rebase)   │                             │
│      │                            │                             │
│      │  <─────────────────────────┤ 6. Approve                  │
│      │                            │                             │
│      │ 7. Squash and merge        │                             │
│      └─────────────────────────>  │                             │
│                                   │                             │
└─────────────────────────────────────────────────────────────────┘

Merge strategies on PR completion:
• Merge commit    : Preserves all commits + merge commit
• Squash merge    : All PR commits → single commit on main
• Rebase merge    : Replay commits on top of main (no merge commit)

Monorepo Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    POLYREPO (Traditional)                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                  │
│  │ repo-api │    │ repo-web │    │ repo-lib │                  │
│  ├──────────┤    ├──────────┤    ├──────────┤                  │
│  │ .git/    │    │ .git/    │    │ .git/    │                  │
│  │ src/     │    │ src/     │    │ src/     │                  │
│  │ tests/   │    │ tests/   │    │ tests/   │                  │
│  └──────────┘    └──────────┘    └──────────┘                  │
│        ↑              ↑              ↑                          │
│        └──────────────┴──────────────┘                          │
│           npm install from registry                             │
│                                                                 │
│  Problems: Dependency versioning, cross-repo changes,           │
│           inconsistent tooling, "diamond dependency" hell       │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                    MONOREPO (Single Repository)                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ monorepo/                                               │   │
│  ├─────────────────────────────────────────────────────────┤   │
│  │ .git/                                                   │   │
│  │ packages/                                               │   │
│  │   ├── api/        (can import from lib directly)       │   │
│  │   ├── web/        (can import from lib directly)       │   │
│  │   └── lib/        (shared code)                        │   │
│  │ tools/                                                  │   │
│  │   └── build-system/                                     │   │
│  │ nx.json / turbo.json                                    │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Benefits: Atomic commits across packages, shared tooling,      │
│           single version of dependencies, easier refactoring    │
│                                                                 │
│  Challenges: Scale (clone time, CI time), permission control,   │
│             finding what changed in large diffs                 │
└─────────────────────────────────────────────────────────────────┘

Monorepo tools comparison:
┌─────────────┬───────────────┬───────────────┬──────────────────┐
│ Tool        │ Task Caching  │ Affected Cmds │ Language Support │
├─────────────┼───────────────┼───────────────┼──────────────────┤
│ Nx          │ Local+Remote  │ Yes           │ JS/TS, Go, Rust  │
│ Turborepo   │ Local+Remote  │ Limited       │ JS/TS primarily  │
│ Bazel       │ Remote        │ Yes           │ Polyglot         │
│ Lerna       │ No (legacy)   │ Yes           │ JS/TS            │
│ Rush        │ Local         │ Yes           │ JS/TS            │
└─────────────┴───────────────┴───────────────┴──────────────────┘

Git Internals: The Plumbing Commands

┌─────────────────────────────────────────────────────────────────┐
│              PORCELAIN vs PLUMBING                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  PORCELAIN (User-friendly)    PLUMBING (Low-level)             │
│  ─────────────────────────    ─────────────────────            │
│  git add                      git hash-object                   │
│  git commit                   git update-index                  │
│  git checkout                 git read-tree                     │
│  git merge                    git write-tree                    │
│  git rebase                   git commit-tree                   │
│  git log                      git cat-file                      │
│  git status                   git ls-files                      │
│  git diff                     git rev-parse                     │
│                               git update-ref                    │
│                               git symbolic-ref                  │
│                                                                 │
│  Understanding plumbing = understanding Git                     │
└─────────────────────────────────────────────────────────────────┘

Example: What "git commit" actually does:

1. git write-tree        → Create tree object from index
2. git commit-tree       → Create commit object pointing to tree
3. git update-ref HEAD   → Update HEAD to point to new commit

You can manually create commits using only plumbing commands!
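A sketch of the storage half of that pipeline, assuming only Python's standard library: it emulates what `git hash-object -w` does for a blob, writing the zlib-compressed object under a throwaway objects directory rather than a real repository.

```python
import hashlib, os, tempfile, zlib

def write_loose_object(git_dir: str, obj_type: str, content: bytes) -> str:
    """Store an object at objects/xx/yyyy... (zlib-compressed) and return its SHA."""
    store = f"{obj_type} {len(content)}".encode() + b"\x00" + content
    sha = hashlib.sha1(store).hexdigest()
    path = os.path.join(git_dir, "objects", sha[:2], sha[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))
    return sha

git_dir = os.path.join(tempfile.mkdtemp(), ".git")   # throwaway, not a real repo
sha = write_loose_object(git_dir, "blob", b"Hello, World!\n")
print(sha[:2] + "/" + sha[2:])  # the on-disk path Git would use for this blob
```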

Concept Summary Table

  • Object Model: Git stores snapshots (blobs, trees, commits, tags) in a content-addressable database. Every object is identified by its SHA-1 hash.
  • The DAG: Commits form a directed acyclic graph where each commit points to its parent(s). Branches are just pointers to commits.
  • Merge vs. Rebase: Merge preserves history with merge commits; rebase rewrites history for linearity. Never rebase shared branches.
  • Trunk-Based Dev: All developers commit to main frequently (daily), using feature flags for incomplete work. Minimizes merge conflicts.
  • Interactive Rebase: Rewrite local history (squash, reorder, edit, drop commits). Essential for clean PRs.
  • Code Review Flow: PRs gate changes. Reviewers enforce quality. Squash/rebase on merge for clean main history.
  • Monorepo Patterns: Single repo for multiple projects. Requires affected commands, task caching, and smart CI to scale.
  • Plumbing Commands: Low-level commands that porcelain builds upon. Understanding these = understanding Git.

Deep Dive Reading by Concept

Git Internals

  • Object model (blobs, trees, commits): Pro Git by Scott Chacon — Ch. 10.1-10.2: “Git Internals - Plumbing and Porcelain”
  • Pack files and garbage collection: Pro Git by Scott Chacon — Ch. 10.4: “Packfiles”
  • How refs work: Pro Git by Scott Chacon — Ch. 10.3: “Git References”

Branching and Merging

  • Branch mechanics: Pro Git by Scott Chacon — Ch. 3.1: “Branches in a Nutshell”
  • Merge strategies: Pro Git by Scott Chacon — Ch. 3.2: “Basic Branching and Merging”
  • Rebase fundamentals: Pro Git by Scott Chacon — Ch. 3.6: “Rebasing”
  • Advanced rebasing: Git Internals by Scott Chacon (Peepcode PDF) — Ch. 5: “Rebasing”

Workflows

  • Distributed workflows: Pro Git by Scott Chacon — Ch. 5.1: “Distributed Workflows”
  • Contributing and maintaining projects: Pro Git by Scott Chacon — Ch. 5.2-5.3: “Contributing to a Project” and “Maintaining a Project”
  • Trunk-based development: Accelerate by Nicole Forsgren — Ch. 4: “Technical Practices”
  • Continuous delivery: Continuous Delivery by Jez Humble — Ch. 14: “Version Control”

Monorepos

  • Monorepo philosophy: Software Engineering at Google by Winters et al. — Ch. 16: “Version Control and Branch Management”
  • Scaling build systems: Software Engineering at Google — Ch. 18: “Build Systems and Build Philosophy”

Essential Reading Order

For maximum comprehension, read in this order:

  1. Foundation (Week 1):
    • Pro Git Ch. 10.1-10.3 (Git internals)
    • Pro Git Ch. 3.1-3.2 (Branching basics)
  2. Intermediate (Week 2):
    • Pro Git Ch. 3.6 (Rebasing)
    • Pro Git Ch. 5.1-5.3 (Distributed workflows)
  3. Advanced (Week 3-4):
    • Accelerate Ch. 4 (Trunk-based development)
    • Software Engineering at Google Ch. 16 (Version control at scale)

Project 1: Git Object Explorer

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, C
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Git Internals / Binary Parsing
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A tool that explores the .git directory, decompresses and parses Git objects (blobs, trees, commits, tags), and displays their contents in human-readable format with SHA verification.

Why it teaches Git workflows: Before you can master advanced workflows, you need to see what Git actually stores. This project forces you to understand that commits are just text files with parent pointers, branches are just files containing SHAs, and the entire history is a content-addressable database.

Core challenges you’ll face:

  • Decompressing zlib-compressed objects → maps to understanding Git’s storage format
  • Parsing different object types → maps to understanding blob vs. tree vs. commit structure
  • Following parent pointers to reconstruct history → maps to understanding the DAG
  • Verifying SHA-1 hashes → maps to understanding content-addressability

Key Concepts:

  • Object storage format: Pro Git Ch. 10.2 — Scott Chacon
  • Zlib compression: Python zlib module documentation
  • SHA-1 hashing: Serious Cryptography Ch. 6 — Aumasson

Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: Python file I/O, basic understanding of hashing, familiarity with command-line git


Real World Outcome

You’ll have a command-line tool that can inspect any Git repository’s internal structure. When you run it, you’ll see the raw objects that make up Git’s database:

Example Output:

$ ./git-explorer /path/to/repo

=== Git Object Explorer ===
Repository: /path/to/repo

Scanning .git/objects...
Found 247 objects

--- Object: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad ---
Type: commit
Size: 243 bytes
SHA verified: ✓

tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent a1b2c3d4e5f67890a1b2c3d4e5f67890a1b2c3d4
author Alice <alice@example.com> 1703001600 -0800
committer Alice <alice@example.com> 1703001600 -0800

Add user authentication feature

--- Object: 4b825dc642cb6eb9a060e54bf8d69288fbee4904 ---
Type: tree
Size: 66 bytes

100644 blob abc123...  README.md
100755 blob def456...  src/main.py
040000 tree ghi789...  tests/

--- Ref: refs/heads/main ---
Points to: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

$ ./git-explorer --follow 3b18e512

Commit graph from 3b18e512:
3b18e51 → a1b2c3d → 5e6f7a8  (root)
    │
    └── "Add user authentication feature"

The Core Question You’re Answering

“What IS a Git commit? What does Git actually store, and how does branching work at the byte level?”

Before you write any code, sit with this question. Most developers think of commits as “diffs” or “changes,” but Git stores complete snapshots. A branch isn’t a separate copy of files—it’s just a 41-byte file containing a SHA hash.


Concepts You Must Understand First

Stop and research these before coding:

  1. Content-Addressable Storage
    • What does “content-addressable” mean?
    • Why does changing one byte in a file create a completely different hash?
    • How does this enable Git’s integrity checking?
    • Book Reference: “Pro Git” Ch. 10.2 — Scott Chacon
  2. Zlib Compression
    • What algorithm does zlib use (hint: DEFLATE)?
    • Why does Git compress objects?
    • How do you identify compressed vs. uncompressed data?
    • Book Reference: Python zlib module documentation
  3. Object Format
    • What’s the header format for Git objects?
    • How do blob, tree, and commit objects differ structurally?
    • What’s the difference between object content and object hash input?
    • Book Reference: “Pro Git” Ch. 10.2 — Scott Chacon

Questions to Guide Your Design

Before implementing, think through these:

  1. Object Discovery
    • How will you find all objects in .git/objects/?
    • What about packed objects in .git/objects/pack/?
    • How do you handle the xx/yyyyyy... directory structure?
  2. Parsing Strategy
    • How will you detect the object type from the header?
    • How will you handle null bytes in binary data?
    • How will you parse tree entries (mode, name, SHA)?
  3. Verification
    • How do you verify the SHA matches the content?
    • What should happen if verification fails?
    • How do you handle corrupted objects?

Thinking Exercise

Trace a Commit’s Components

Before coding, manually inspect a real Git object:

# In any git repo, find an object
$ ls .git/objects/
3b/  4a/  5c/  info/  pack/

$ ls .git/objects/3b/
18e512dba79e4c8300dd08aeb37f8e728b8dad

# Decompress and view it
$ python3 -c "import zlib; print(zlib.decompress(open('.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad', 'rb').read()))"

Questions while tracing:

  • What’s the format of the header you see?
  • How many null bytes separate header from content?
  • If it’s a commit, what fields do you see?
  • Can you manually verify the SHA by hashing “type size\0content”?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between a Git blob and a Git tree?”
  2. “Why can’t you have two different files with the same content in a Git repo?”
  3. “What happens internally when you run git add?”
  4. “Explain why changing a single character in a file changes the commit hash of every ancestor.”
  5. “How does Git know if a file has been modified without storing diffs?”

Hints in Layers

Hint 1: Starting Point Look inside .git/objects/. The first two characters of a SHA become a subdirectory name; the rest is the filename.

Hint 2: Reading Objects Every object starts with: {type} {size}\0{content}. Use zlib.decompress() to get the raw bytes first.

Hint 3: Parsing Types After decompression, split on the first null byte. Parse the header to get type and size. For trees, entries are: {mode} {filename}\0{20-byte binary SHA}.

Hint 4: Verification To verify a SHA, compute: sha1(f"{type} {len(content)}\0{content}"). The result should match the filename.


Books That Will Help

  • Git object model: “Pro Git” by Scott Chacon — Ch. 10.1-10.2
  • Binary file parsing in Python: “Black Hat Python” by Justin Seitz — Ch. 3
  • Content-addressable storage: “Designing Data-Intensive Applications” by Kleppmann — Ch. 3

Implementation Hints

Git objects are stored as: {type} {size}\0{content}, then zlib-compressed, then placed in .git/objects/{first-2-chars-of-sha}/{remaining-38-chars}.

Object types:

  • blob: Just raw file content
  • tree: List of entries, each with mode (ASCII octal, e.g. 100644; no leading zero, so directories are 40000), space, filename, null byte, then 20-byte binary SHA
  • commit: Text with “tree”, “parent” (0 or more), “author”, “committer”, blank line, message

To build the DAG visualization, follow parent pointers recursively until you hit a commit with no parents (the root).
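Reading objects back is the mirror image of storing them. The sketch below decompresses a loose object, splits the header at the first null byte, and parses tree entries; the tree bytes are constructed by hand so the example runs without a real repository.

```python
import zlib

def parse_object(compressed: bytes):
    """Decompress a loose object and split "{type} {size}\\0{content}"."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")
    obj_type, size = header.split(b" ")
    assert int(size) == len(body), "size in header must match content length"
    return obj_type.decode(), body

def parse_tree(body: bytes):
    """Return (mode, name, sha_hex) triples parsed from raw tree bytes."""
    entries, i = [], 0
    while i < len(body):
        nul = body.index(b"\x00", i)
        mode, name = body[i:nul].split(b" ", 1)
        sha = body[nul + 1 : nul + 21].hex()   # 20 raw bytes, not 40 hex characters
        entries.append((mode.decode(), name.decode(), sha))
        i = nul + 21
    return entries

# Hand-built tree with one entry, compressed the way a loose object would be.
entry = b"100644 README.md\x00" + bytes.fromhex("e69de29bb2d1d6434b8b29ae775ad8c2e48c539c")
tree_obj = zlib.compress(b"tree %d\x00%s" % (len(entry), entry))
obj_type, body = parse_object(tree_obj)
print(obj_type, parse_tree(body))
```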

Learning milestones:

  1. You can decompress and read a blob → You understand the storage format
  2. You can parse tree entries and follow them → You understand directory structure
  3. You can follow parent pointers through commits → You understand the DAG
  4. You can verify SHAs match content → You understand content-addressability

Project 2: Commit Graph Visualizer

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, JavaScript (D3.js for visualization)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Graph Algorithms / Git DAG
  • Software or Tool: Git, Graphviz or D3.js
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A tool that reads a Git repository and generates a visual graph showing commits, branches, merges, and tags—revealing the true DAG structure that underlies Git history.

Why it teaches Git workflows: When you visualize the DAG, you finally understand why rebase “rewrites history” (it creates new commits with different parents), why merge creates a commit with two parents, and how branches are just movable pointers.

Core challenges you’ll face:

  • Walking the commit graph efficiently → maps to understanding parent relationships
  • Detecting merge commits (multiple parents) → maps to understanding merge vs. rebase
  • Positioning nodes for readability → maps to understanding branch topology
  • Mapping refs to commits → maps to understanding branches as pointers

Key Concepts:

  • DAG traversal: Grokking Algorithms Ch. 6 — Aditya Bhargava
  • Git refs: Pro Git Ch. 10.3 — Scott Chacon
  • Graph layout algorithms: Sugiyama algorithm for DAG visualization

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1 completed, basic graph theory, understanding of topological sort


Real World Outcome

You’ll have a tool that generates visual representations of Git history, revealing the true graph structure:

Example Output:

$ ./git-graph --repo /path/to/repo --output graph.png

Analyzing repository...
Found 156 commits across 8 branches
Detected 12 merge commits
Generating graph...

Saved to: graph.png

ASCII preview:
    * 3b18e51 (HEAD -> main) Merge feature-auth
    |\
    | * a1b2c3d Add password hashing
    | * 5e6f7a8 Add login form
    |/
    * 9b0c1d2 Update README
    * c3d4e5f Initial commit

$ ./git-graph --repo /path/to/repo --format svg --show-refs

[Opens browser with interactive SVG showing]:
- Commit nodes colored by author
- Branch labels at their current positions
- Merge commits highlighted
- Clickable nodes showing commit details

The Core Question You’re Answering

“What does ‘rewriting history’ actually mean, and why is the commit graph a DAG, not a tree?”

Before you write any code, sit with this question. A DAG (Directed Acyclic Graph) allows multiple parents (merges) and multiple children (branches), but no cycles. Understanding this structure explains why you can’t have a commit that’s its own ancestor.


Concepts You Must Understand First

Stop and research these before coding:

  1. Graph Theory Basics
    • What’s the difference between a DAG and a tree?
    • What’s topological sorting and why does it matter for Git?
    • How do you detect cycles in a graph?
    • Book Reference: “Grokking Algorithms” Ch. 6 — Bhargava
  2. Git References
    • What’s the difference between a branch, a tag, and HEAD?
    • What does “detached HEAD” mean?
    • Where are refs stored in .git/?
    • Book Reference: “Pro Git” Ch. 10.3 — Chacon
  3. Merge Commit Structure
    • How does a merge commit differ from a regular commit?
    • What are the first and second parents of a merge?
    • What does git log --first-parent show?
    • Book Reference: “Pro Git” Ch. 3.2 — Chacon

Questions to Guide Your Design

Before implementing, think through these:

  1. Graph Construction
    • How will you build the graph in memory?
    • What data structure represents a commit node?
    • How do you handle the fact that Git stores parents, not children?
  2. Layout Algorithm
    • How do you position nodes so branches don’t overlap?
    • How do you handle very long linear histories?
    • Should time flow top-to-bottom or left-to-right?
  3. Branch Assignment
    • A commit can be reachable from multiple branches—how do you show this?
    • How do you determine the “main line” for display?

Thinking Exercise

Trace a Merge Manually

Create a test repository and trace what happens:

git init -b main test-repo && cd test-repo
echo "initial" > file.txt && git add . && git commit -m "A"
git checkout -b feature
echo "feature" >> file.txt && git commit -am "B"
echo "more feature" >> file.txt && git commit -am "C"
git checkout main
echo "main work" >> file.txt && git commit -am "D"
git merge feature -m "E: Merge feature"

Questions while tracing:

  • Draw the DAG on paper. How many parents does commit E have?
  • If you now run git log --oneline, what order do you see?
  • What about git log --oneline --first-parent?
  • Look at .git/refs/heads/ — what files exist?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the difference between merge and rebase using a graph diagram.”
  2. “What happens to the commit graph when you force push after a rebase?”
  3. “How would you find the common ancestor of two branches programmatically?”
  4. “Why does git log show commits in the order it does?”
  5. “What’s the complexity of finding if commit A is an ancestor of commit B?”

Hints in Layers

Hint 1: Starting Point Read all refs from .git/refs/ (branches and tags) and HEAD. Each points to a commit SHA.

Hint 2: Graph Building Starting from each ref, walk parent pointers recursively. Keep a set of visited commits—not to break cycles (Git guarantees the history is acyclic), but because many refs share history and you would otherwise re-walk the same commits once per path.

Hint 3: Layout Assign each branch a “lane” (column). Commits on the same branch go in the same lane. Merge commits connect lanes.

Hint 4: Tools Graphviz’s DOT format is simple: "sha1" -> "sha2" for edges. Let Graphviz handle layout with dot -Tpng.
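Hint 4 in code form—one way to emit DOT from a parent map (function and variable names are illustrative, not from any spec):

```python
def to_dot(parents, labels=None):
    """Render a commit graph as Graphviz DOT, with edges pointing at parents.

    parents: {sha: [parent_shas]}; labels: optional {sha: display_label}.
    """
    labels = labels or {}
    lines = ["digraph G {", "    rankdir=BT;", "    node [shape=circle];"]
    for sha in parents:
        label = labels.get(sha, sha[:7])   # default label: abbreviated SHA
        lines.append(f'    "{sha}" [label="{label}"];')
    for sha, ps in parents.items():
        for p in ps:
            lines.append(f'    "{sha}" -> "{p}";')
    lines.append("}")
    return "\n".join(lines)
```

Pipe the result into `dot -Tpng -o graph.png` and Graphviz does the layout for you.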


Books That Will Help

Topic Book Chapter
Graph algorithms “Grokking Algorithms” by Bhargava Ch. 6
Git internal refs “Pro Git” by Chacon Ch. 10.3
DAG visualization “Graph Drawing” by Tamassia Ch. 9

Implementation Hints

Start by generating DOT format for Graphviz—it handles layout automatically:

digraph G {
    rankdir=BT;  // bottom to top
    node [shape=circle];

    "3b18e51" [label="E\nMerge"];
    "a1b2c3d" [label="C"];
    "5e6f7a8" [label="B"];
    "9b0c1d2" [label="D"];
    "e3f4a5b" [label="A"];

    "3b18e51" -> "9b0c1d2";  // first parent (D, the main tip at merge time)
    "3b18e51" -> "a1b2c3d";  // second parent (C, the merged feature tip)
    "a1b2c3d" -> "5e6f7a8";
    "5e6f7a8" -> "e3f4a5b";
    "9b0c1d2" -> "e3f4a5b";
}

For branch labels, use Graphviz node attributes to add color or labels at the commit the branch points to.

Learning milestones:

  1. You can walk parent pointers and build a graph → You understand the DAG structure
  2. You can identify merge commits by parent count → You understand merge mechanics
  3. You can map refs to commits → You understand branches as pointers
  4. You can generate readable visualizations → You can explain Git history to others

Project 3: Interactive Rebase Simulator

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Rust, TypeScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Git Internals / State Machines
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A tool that simulates git rebase -i without actually modifying the repository, showing exactly what commits would be created, dropped, or modified—and explaining why.

Why it teaches Git workflows: Interactive rebase is the most powerful and most misunderstood Git command. By simulating it step-by-step, you’ll understand that rebase doesn’t “move” commits—it creates new commits with the same changes but different parents (and therefore different SHAs).

Core challenges you’ll face:

  • Parsing the todo list format → maps to understanding rebase operations
  • Simulating commit replaying → maps to understanding how commits are recreated
  • Handling squash/fixup → maps to understanding commit combination
  • Predicting new SHAs → maps to understanding content-addressability

Key Concepts:

  • Rebase internals: Pro Git Ch. 3.6 — Scott Chacon
  • Interactive rebase commands: Pro Git Ch. 7.6 — Scott Chacon
  • State machine design: Clean Code Ch. 10 — Martin

Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Project 1 and 2 completed, understanding of rebase vs merge


Real World Outcome

You’ll have a tool that shows exactly what an interactive rebase would do, step by step:

Example Output:

$ ./rebase-sim --repo /path/to/repo --onto main --branch feature

=== Interactive Rebase Simulator ===
Simulating: git rebase -i main feature

Current branch 'feature' has 5 commits not on 'main':
  abc1234 Add user model
  def5678 Fix typo in user model
  ghi9012 Add authentication
  jkl3456 WIP debugging
  mno7890 Add password hashing

Enter todo commands (or press Enter for default 'pick all'):
pick abc1234
squash def5678
pick ghi9012
drop jkl3456
pick mno7890

=== SIMULATION RESULTS ===

Step 1: pick abc1234 "Add user model"
  Old SHA: abc1234...
  New SHA: xyz7777... (different because parent changed!)
  Old parent: 111222... (old main)
  New parent: 999888... (current main tip)

Step 2: squash def5678 "Fix typo in user model"
  Combining with previous commit...
  New message will be:
    "Add user model

    Fix typo in user model"
  Combined SHA: uvw4444...

Step 3: pick ghi9012 "Add authentication"
  Old SHA: ghi9012...
  New SHA: rst5555...

Step 4: drop jkl3456 "WIP debugging"
  ⚠️  This commit will be DELETED
  Changes in this commit:
    - src/debug.py (will be lost!)

Step 5: pick mno7890 "Add password hashing"
  Old SHA: mno7890...
  New SHA: pqr6666...

=== FINAL STATE ===
main:    999888... (unchanged)
feature: pqr6666... (was: mno7890...)

Commits on feature after rebase: 4 (was 5)
Total commits created: 4 new; the 5 old commits become unreachable (recoverable via reflog until GC)

WARNING: 1 commit dropped. Changes may be lost!

The Core Question You’re Answering

“Why does rebase ‘rewrite history,’ and what does that actually mean at the commit level?”

Before you write any code, sit with this question. When you rebase, Git doesn’t move commits—it replays the changes onto a new base, creating entirely new commits. The old commits still exist (until garbage collection), but your branch pointer moves to the new chain.


Concepts You Must Understand First

Stop and research these before coding:

  1. Commit Identity
    • What determines a commit’s SHA?
    • If you change just the parent, what happens to the SHA?
    • If you change the commit message, what happens to the SHA?
    • Book Reference: “Pro Git” Ch. 10.2 — Chacon
  2. Rebase Operations
    • What does each rebase command (pick, squash, fixup, reword, edit, drop) do?
    • How does squash differ from fixup?
    • What happens during a rebase conflict?
    • Book Reference: “Pro Git” Ch. 7.6 — Chacon
  3. The Three-Way Merge
    • How does Git apply changes from one commit onto another?
    • What’s the “merge base” in a rebase context?
    • Why can rebase produce different conflicts than merge?
    • Book Reference: “Pro Git” Ch. 3.2 — Chacon

Questions to Guide Your Design

Before implementing, think through these:

  1. Simulation Fidelity
    • How will you calculate what the new SHA would be without creating objects?
    • Can you predict if there would be conflicts?
    • How do you represent the state after each step?
  2. Commit Combination
    • When squashing, how do you combine commit messages?
    • When squashing, how do you combine tree states?
    • What if squashed commits touched the same file?
  3. User Interface
    • How do you present the todo list for editing?
    • How do you show the diff between old and new state?
    • How do you warn about potentially lost changes?

Thinking Exercise

Trace a Rebase Manually

Create and rebase a test branch:

git init -b main test && cd test
echo "a" > file && git add . && git commit -m "A"
echo "b" > file && git commit -am "B"
git checkout -b feature
echo "c" > file && git commit -am "C"
echo "d" > file && git commit -am "D"
git checkout main
echo "e" > file && git commit -am "E"
git checkout feature
git log --oneline --all --graph  # Note the SHAs
git rebase main
git log --oneline --all --graph  # Compare the SHAs

Questions while tracing:

  • What are the SHAs of C and D before rebase?
  • What are the SHAs of C’ and D’ after rebase?
  • Can you find the old commits with git reflog?
  • What happened to the parent pointers?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain what happens step-by-step when you run git rebase main from a feature branch.”
  2. “Why should you never rebase commits that have been pushed to a shared branch?”
  3. “What’s the difference between git rebase -i with squash versus fixup?”
  4. “How would you recover commits that were ‘lost’ during a rebase?”
  5. “When would you use rebase vs. merge, and what are the tradeoffs?”

Hints in Layers

Hint 1: Starting Point Parse the todo file format: <command> <sha> <message>. The commands are: pick, reword, edit, squash, fixup, drop.

Hint 2: SHA Calculation A commit’s SHA is sha1(f"commit {size}\0{content}"), where size is the byte length of the content. The content is the raw commit object: tree, parent(s), author, committer, and message.
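A sketch of that calculation in Python. The author/committer strings below are placeholders for Git's full `Name <email> timestamp tz` fields, so the SHAs won't match a real repository's—but the key property holds either way: change any field (even just a parent) and the SHA changes completely.

```python
import hashlib

def commit_sha(tree, parents, author, committer, message):
    """Compute the SHA-1 Git would assign to a commit object.

    Git hashes the header 'commit <size>' plus a NUL byte, followed
    by the raw commit body.
    """
    body_lines = [f"tree {tree}"]
    body_lines += [f"parent {p}" for p in parents]   # merges have several
    body_lines.append(f"author {author}")
    body_lines.append(f"committer {committer}")
    body = "\n".join(body_lines) + "\n\n" + message + "\n"
    data = body.encode()
    store = b"commit %d\x00" % len(data) + data      # header + NUL + body
    return hashlib.sha1(store).hexdigest()
```

This directly answers the concept questions above: a new parent means new bytes, which means a new SHA—which is exactly why rebase must create new commits.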

Hint 3: Squash Logic When squashing, the tree comes from applying both commits’ changes, and the message combines both (unless fixup, which discards the second message).

Hint 4: Conflict Detection To predict conflicts, you’d need to simulate the three-way merge. For this project, you can note “potential conflict” when the same file is modified.


Books That Will Help

Topic Book Chapter
Rebase in depth “Pro Git” by Chacon Ch. 3.6, 7.6
Three-way merge “Version Control with Git” by Loeliger Ch. 9
Reflog and recovery “Pro Git” by Chacon Ch. 7.3

Implementation Hints

The rebase todo format is straightforward:

pick abc1234 First commit message
squash def5678 Second commit message
reword ghi9012 Third commit message

For simulation, track the state as you process each line:

  • pick: New commit with same tree, new parent
  • squash/fixup: Combine with previous commit
  • reword: New commit with modified message
  • drop: Skip entirely
  • edit: Pause (in your simulation, just note it)

To compute what the new SHA would be, you need:

  1. The tree SHA (same as original for pick/reword/drop)
  2. The new parent SHA (previous simulated commit or rebase base)
  3. The author info (usually preserved)
  4. The committer info (YOU, at current time)
  5. The message
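Those five ingredients suggest a simple state machine. A minimal simulation sketch—it models only pick/squash/fixup/drop and uses placeholder strings instead of computing real SHAs:

```python
def simulate_rebase(todo, commits, new_base):
    """Replay a rebase todo list without touching any repository.

    todo: list of (command, sha) pairs, in order.
    commits: {sha: message} for the commits being rebased.
    new_base: the SHA the chain is replayed onto.
    Returns the new chain as [parent, message] pairs; in real Git,
    each entry would become a brand-new commit with a brand-new SHA.
    """
    chain = []
    parent = new_base
    for command, sha in todo:
        msg = commits[sha]
        if command == "pick":
            chain.append([parent, msg])
            parent = f"new({sha})"        # placeholder for the new SHA
        elif command in ("squash", "fixup"):
            if not chain:
                raise ValueError("cannot squash without a previous commit")
            if command == "squash":       # squash keeps both messages;
                chain[-1][1] += "\n\n" + msg  # fixup discards the second
        elif command == "drop":
            continue                      # commit (and its changes) skipped
        else:
            raise ValueError(f"unhandled command: {command}")
    return chain
```

Extending this with reword/edit, and with real SHA prediction via the commit-object hash, is the heart of the project.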

Learning milestones:

  1. You can parse and validate a todo list → You understand rebase operations
  2. You can simulate pick and compute new SHAs → You understand commit recreation
  3. You can simulate squash/fixup → You understand commit combination
  4. You can detect potential issues (drops, conflicts) → You understand rebase risks

Project 4: Three-Way Merge Engine

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Python
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Diff Algorithms / Merge Strategies
  • Software or Tool: Git (for comparison)
  • Main Book: The Algorithm Design Manual by Skiena

What you’ll build: A three-way merge tool that takes a base version, “ours” version, and “theirs” version of a file and produces either a merged result or conflict markers—exactly like Git does.

Why it teaches Git workflows: Every merge, rebase, and cherry-pick uses three-way merge internally. Understanding this algorithm explains why certain changes conflict and others don’t, and why merge is generally safer than manual patching.

Core challenges you’ll face:

  • Implementing diff (longest common subsequence) → maps to understanding how changes are detected
  • Handling non-conflicting parallel changes → maps to understanding automatic merge
  • Detecting and marking conflicts → maps to understanding merge conflict format
  • Choosing between merge strategies → maps to understanding recursive vs. resolve vs. octopus

Key Concepts:

  • Longest Common Subsequence: The Algorithm Design Manual Ch. 8 — Skiena
  • diff algorithm: “An O(ND) Difference Algorithm” — Eugene Myers
  • Three-way merge: Pro Git Ch. 3.2 — Chacon

Difficulty: Expert Time estimate: 2-4 weeks Prerequisites: Projects 1-3 completed, dynamic programming, understanding of diff


Real World Outcome

You’ll have a merge tool that can combine file versions exactly like Git:

Example Output:

$ ./merge3 base.txt ours.txt theirs.txt

=== Three-Way Merge ===

Base version:
1: Hello World
2: This is a test
3: Goodbye

Ours version (changes on our branch):
1: Hello World
2: This is a test
3: This is our change
4: Goodbye

Theirs version (changes on their branch):
1: Hello World
2: Their modification here
3: This is a test
4: Goodbye

=== Diff Analysis ===
Line 2: THEIRS modified (base→theirs differs, base=ours)
Line 3: OURS added (ours has extra line)

=== Merge Result (no conflicts!) ===
1: Hello World
2: Their modification here
3: This is a test
4: This is our change
5: Goodbye

$ ./merge3 base.txt ours.txt theirs.txt --conflict-case

=== CONFLICT DETECTED ===

Both modified line 2:
  Base:   "This is a test"
  Ours:   "Our version of line 2"
  Theirs: "Their version of line 2"

Merged output with conflict markers:
1: Hello World
<<<<<<< OURS
2: Our version of line 2
=======
2: Their version of line 2
>>>>>>> THEIRS
3: Goodbye

The Core Question You’re Answering

“How does Git know when changes can be automatically merged and when they conflict?”

Before you write any code, sit with this question. The key insight is the BASE version—Git doesn’t just compare two files, it compares both to their common ancestor. If only one side changed a line, that change can be applied automatically.


Concepts You Must Understand First

Stop and research these before coding:

  1. Longest Common Subsequence (LCS)
    • What’s the difference between LCS and longest common substring?
    • How does dynamic programming solve LCS in O(mn) time?
    • How does LCS relate to computing diffs?
    • Book Reference: “The Algorithm Design Manual” Ch. 8 — Skiena
  2. The Diff Algorithm
    • How does Myers’ diff algorithm work?
    • What’s an edit script?
    • How do you go from LCS to a list of insertions/deletions?
    • Paper: “An O(ND) Difference Algorithm” — Eugene Myers
  3. Three-Way Merge Logic
    • What are the four possible states of a line (unchanged, ours-only, theirs-only, both)?
    • When is a change non-conflicting?
    • What’s the format of Git’s conflict markers?
    • Book Reference: “Pro Git” Ch. 3.2 — Chacon

Questions to Guide Your Design

Before implementing, think through these:

  1. Diff Representation
    • How will you represent a diff? As edit operations? As hunks?
    • How do you handle lines that moved (not just added/deleted)?
    • Should you diff by lines or by characters within lines?
  2. Merge Algorithm
    • How do you align the three versions?
    • What if ours and theirs made the same change?
    • What if ours deleted a line that theirs modified?
  3. Conflict Handling
    • How do you represent the conflict region?
    • Should you include context lines?
    • How do you handle nested conflicts?

Thinking Exercise

Trace a Merge Manually

Set up a conflict scenario:

git init -b main merge-test && cd merge-test
printf "line1\nline2\nline3\n" > file.txt
git add . && git commit -m "Base"
git checkout -b feature
printf "line1\nfeature-line2\nline3\n" > file.txt
git commit -am "Feature change"
git checkout main
printf "line1\nmain-line2\nline3\n" > file.txt
git commit -am "Main change"
git merge feature  # Will conflict!
cat file.txt  # See conflict markers

Questions while tracing:

  • Draw out base, ours, theirs for line 2
  • Why did Git detect a conflict?
  • What if only one branch had changed line 2?
  • Look at .git/MERGE_HEAD — what’s stored there?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the three-way merge algorithm. What is the ‘base’ and why is it important?”
  2. “What’s the time complexity of computing a diff between two files?”
  3. “Why might git merge succeed when manual file comparison would suggest a conflict?”
  4. “What merge strategies does Git support, and when would you use each?”
  5. “How would you resolve a merge conflict where both sides made the same change?”

Hints in Layers

Hint 1: Starting Point Implement diff first. The simplest approach: compute LCS, then derive insertions/deletions from what’s NOT in the LCS.

Hint 2: LCS Algorithm Use dynamic programming. Build a table where lcs[i][j] = length of LCS of first i lines of A and first j lines of B. Backtrack to find the actual sequence.
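A direct implementation of that hint, assuming the inputs are lists of lines:

```python
def lcs(a, b):
    """Longest common subsequence of two line lists via O(m*n) DP."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # backtrack to recover the actual common sequence
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

Lines of the original not in the LCS are deletions; lines of the new version not in the LCS are insertions—that pair of lists is your diff.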

Hint 3: Three-Way Logic Compute diff(base, ours) and diff(base, theirs). For each line region, categorize: unchanged, ours-only, theirs-only, or conflict.

Hint 4: Conflict Markers Git’s format:

<<<<<<< HEAD
our content
=======
their content
>>>>>>> branch-name

Books That Will Help

Topic Book Chapter
LCS algorithm “The Algorithm Design Manual” by Skiena Ch. 8
Diff algorithm “An O(ND) Difference Algorithm” by Myers Paper
Merge internals “Version Control with Git” by Loeliger Ch. 9

Implementation Hints

The LCS dynamic programming table:

       ""  l  i  n  e  1
    ""  0  0  0  0  0  0
    l   0  1  1  1  1  1
    i   0  1  2  2  2  2
    n   0  1  2  3  3  3
    ...

Three-way merge pseudocode:

for each line region:
    if base == ours == theirs:
        output(base)  # unchanged
    elif base == ours and base != theirs:
        output(theirs)  # theirs changed, use theirs
    elif base != ours and base == theirs:
        output(ours)  # ours changed, use ours
    elif ours == theirs:
        output(ours)  # same change, either is fine
    else:
        output(conflict_markers(ours, theirs))  # conflict!
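The pseudocode above can be made runnable. This sketch assumes the three versions are already aligned line-by-line (equal lengths); real Git first aligns regions using diff, which is the harder part of this project:

```python
def merge3(base, ours, theirs):
    """Line-by-line three-way merge over pre-aligned versions.

    Assumes all three lists have equal length (alignment already done).
    Returns (merged_lines, had_conflict).
    """
    assert len(base) == len(ours) == len(theirs), "versions must be aligned"
    merged, conflict = [], False
    for b, o, t in zip(base, ours, theirs):
        if o == t:                 # unchanged, or both made the same change
            merged.append(o)
        elif b == o:               # only theirs changed: take theirs
            merged.append(t)
        elif b == t:               # only ours changed: take ours
            merged.append(o)
        else:                      # both changed differently: conflict
            conflict = True
            merged += ["<<<<<<< OURS", o, "=======", t, ">>>>>>> THEIRS"]
    return merged, conflict
```

Note how the base version is what makes automatic merging possible: without it, "ours differs from theirs" would always look like a conflict.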

Learning milestones:

  1. You can compute LCS of two sequences → You understand dynamic programming for strings
  2. You can generate a diff from LCS → You understand edit scripts
  3. You can merge non-conflicting changes → You understand three-way merge logic
  4. You can generate proper conflict markers → You understand Git’s conflict format

Project 5: Git Hooks Framework

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Bash/Python
  • Alternative Programming Languages: Go, Rust, Node.js
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Git Hooks / CI/CD
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A Git hooks management system (like Husky, but from scratch) that allows configuring pre-commit, pre-push, and commit-msg hooks via a config file, with support for running multiple scripts per hook.

Why it teaches Git workflows: Hooks are how teams enforce code quality—running linters, tests, and format checks before code is committed or pushed. Understanding hooks is essential for implementing trunk-based development and code review workflows.

Core challenges you’ll face:

  • Understanding the hook lifecycle → maps to understanding when Git runs each hook
  • Managing hook installation → maps to understanding .git/hooks/ vs. tracked scripts
  • Handling hook failures → maps to understanding how hooks block operations
  • Sharing hooks across a team → maps to understanding that .git/hooks is never tracked by Git

Key Concepts:

  • Git hooks: Pro Git Ch. 8.3 — Scott Chacon
  • Exit codes: Shell scripting fundamentals
  • Process execution: The Linux Programming Interface Ch. 24 — Kerrisk

Difficulty: Intermediate Time estimate: 1 week Prerequisites: Shell scripting, understanding of Git basics


Real World Outcome

You’ll have a hooks framework that your team can use:

Example Output:

$ cat .git-hooks.yaml
hooks:
  pre-commit:
    - name: "Format check"
      run: "npm run format:check"
    - name: "Lint"
      run: "npm run lint"
    - name: "Type check"
      run: "npm run typecheck"

  commit-msg:
    - name: "Conventional commit"
      run: "./scripts/check-commit-msg.sh"

  pre-push:
    - name: "Tests"
      run: "npm test"
    - name: "Build"
      run: "npm run build"

$ ./hooks-manager install
Installing hooks framework...
✓ Created .git/hooks/pre-commit
✓ Created .git/hooks/commit-msg
✓ Created .git/hooks/pre-push
Hooks installed successfully!

$ git commit -m "bad commit"
Running pre-commit hooks...
[1/3] Format check... ✓ (0.5s)
[2/3] Lint... ✗ FAILED (1.2s)

Error: ESLint found 3 errors:
  src/index.ts:15 - Unexpected any type
  src/utils.ts:8 - Missing return type
  src/utils.ts:22 - Unused variable 'temp'

Pre-commit hook failed. Commit aborted.
Fix the issues above or use --no-verify to skip hooks.

$ # Fix issues...
$ git commit -m "feat: add user authentication"
Running pre-commit hooks...
[1/3] Format check... ✓ (0.5s)
[2/3] Lint... ✓ (1.1s)
[3/3] Type check... ✓ (2.3s)

Running commit-msg hooks...
[1/1] Conventional commit... ✓ (0.1s)

[feature 3a4b5c6] feat: add user authentication
 3 files changed, 127 insertions(+)

The Core Question You’re Answering

“How do teams enforce code quality automatically, and why can’t Git hooks be shared through the repository?”

Before you write any code, sit with this question. The .git directory is not tracked by Git itself, so hooks don’t travel with the repo. This is why tools like Husky exist—to bridge tracked config files with untracked hook scripts.


Concepts You Must Understand First

Stop and research these before coding:

  1. Available Git Hooks
    • What hooks exist (pre-commit, commit-msg, pre-push, post-merge, etc.)?
    • What arguments does each hook receive?
    • What does the exit code mean for each hook?
    • Book Reference: “Pro Git” Ch. 8.3 — Chacon
  2. Hook Execution Context
    • What’s the working directory when a hook runs?
    • What environment variables are available?
    • How do you access staged changes vs. working directory?
    • Book Reference: “Pro Git” Ch. 8.3 — Chacon
  3. Exit Codes
    • How do exit codes control whether Git proceeds?
    • How do you propagate failures from child processes?
    • What exit codes should your framework use?
    • Book Reference: “The Linux Command Line” Ch. 27 — Shotts

Questions to Guide Your Design

Before implementing, think through these:

  1. Configuration
    • Where will the config file live (.git-hooks.yaml, .hooks/, package.json)?
    • How will users specify multiple scripts per hook?
    • How will you handle hook arguments and stdin?
  2. Installation
    • How will you install hooks to .git/hooks/?
    • How will you avoid overwriting user’s custom hooks?
    • How will you handle reinstallation on config changes?
  3. Execution
    • How will you run multiple scripts and aggregate results?
    • Should scripts run in parallel or serial?
    • How will you display progress and output?

Thinking Exercise

Explore Git Hooks

Set up and test hooks manually:

git init hook-test && cd hook-test
echo "initial" > file.txt && git add . && git commit -m "init"

# Create a failing pre-commit hook
cat > .git/hooks/pre-commit << 'EOF'
#!/bin/bash
echo "Pre-commit hook running..."
echo "Checking for TODO comments..."
if grep -rn "TODO" --exclude-dir=.git .; then  # skip .git so the hook doesn't match itself
    echo "ERROR: Found TODO comments!"
    exit 1
fi
echo "All clear!"
exit 0
EOF
chmod +x .git/hooks/pre-commit

# Test it
echo "// TODO: fix this" >> file.txt
git add file.txt
git commit -m "test"  # Should fail!

Questions while exploring:

  • What exit code caused the commit to fail?
  • What’s in $GIT_INDEX_FILE during the hook?
  • Try git commit --no-verify — what happens?
  • Check what’s passed via stdin to commit-msg

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How would you enforce that all commits pass linting before being pushed?”
  2. “Why can’t you just add your hooks to .git/hooks/ and commit them?”
  3. “What’s the difference between pre-commit and pre-push hooks?”
  4. “How would you skip hooks for a work-in-progress commit?”
  5. “How do commit-msg hooks work, and how would you enforce conventional commits?”

Hints in Layers

Hint 1: Starting Point Your installed hook script should: read config file, determine which scripts to run, run them in order, and exit 0 only if all succeed.

Hint 2: Config Parsing YAML is nice for config. In bash, you might use simpler formats or shell out to Python for parsing.

Hint 3: Hook Arguments For commit-msg, argument 1 is the path to the message file. For pre-push, stdin contains lines with local/remote refs.

Hint 4: Progress Display Use ANSI colors and \r to overwrite lines. Show [1/3] Linting... then update to [1/3] Linting... ✓


Books That Will Help

Topic Book Chapter
Git hooks “Pro Git” by Chacon Ch. 8.3
Shell scripting “The Linux Command Line” by Shotts Ch. 24-27
Process management “The Linux Programming Interface” by Kerrisk Ch. 24-28

Implementation Hints

Your installed hook script pattern:

#!/bin/bash
# This file is auto-generated - do not edit

HOOK_NAME=$(basename "$0")
CONFIG_FILE=".git-hooks.yaml"

if [ ! -f "$CONFIG_FILE" ]; then
    exit 0  # No config, allow operation
fi

# Parse config, find scripts for this hook type
# Run each script in sequence
# Exit with first failure or 0 if all pass

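The script-running core could equally be written in Python, as Hint 2 suggests. A sketch, assuming the config has already been parsed into (name, command) pairs:

```python
import subprocess

def run_hook(steps):
    """Run each configured step in order; stop at the first failure.

    steps: list of (name, shell_command) pairs. Returns the exit
    code Git will see: 0 lets the operation proceed, non-zero
    blocks the commit/push.
    """
    for i, (name, command) in enumerate(steps, 1):
        print(f"[{i}/{len(steps)}] {name}...", end=" ")
        result = subprocess.run(command, shell=True,
                                capture_output=True, text=True)
        if result.returncode != 0:
            print("FAILED")
            print(result.stdout + result.stderr)  # show the tool's output
            return result.returncode              # propagate the failure
        print("ok")
    return 0
```

The installed bash hook then only needs to exec this runner with the hook name, keeping the per-hook scripts trivial.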
For the installer:

  1. Read config file
  2. For each hook type with scripts, create .git/hooks/{hookname}
  3. Make each executable with chmod +x
  4. Optionally backup existing hooks

Learning milestones:

  1. You can create and trigger a simple hook → You understand hook basics
  2. You can install hooks from config → You understand the installation problem
  3. You can run multiple scripts per hook → You understand hook orchestration
  4. You can display progress and handle failures → You have a usable framework

Project 6: Trunk-Based Development Pipeline

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Bash, TypeScript
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: CI/CD / Feature Flags / Git Workflows
  • Software or Tool: Git, GitHub Actions or similar
  • Main Book: Accelerate by Nicole Forsgren

What you’ll build: A complete trunk-based development pipeline with feature flags, automated testing on every commit, and a CLI tool that manages short-lived branches and enforces trunk-based discipline.

Why it teaches Git workflows: Trunk-based development is how high-performing teams work—Google, Facebook, and Netflix all use variations of it. By implementing the tooling yourself, you’ll understand why long-lived branches cause merge pain and how feature flags enable shipping incomplete code safely.

Core challenges you’ll face:

  • Implementing feature flags → maps to understanding how to hide incomplete features
  • Enforcing short-lived branches → maps to understanding the cost of divergence
  • Automating branch cleanup → maps to understanding branch lifecycle
  • Building CI integration → maps to understanding continuous integration

Key Concepts:

  • Trunk-based development: Accelerate Ch. 4 — Forsgren
  • Feature flags: Continuous Delivery Ch. 10 — Humble & Farley
  • CI/CD principles: The DevOps Handbook Ch. 3 — Kim et al.

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Projects 1-5 completed, understanding of CI systems


Real World Outcome

You’ll have a CLI and supporting infrastructure for trunk-based development:

Example Output:

$ trunk init
Initializing trunk-based development for this repository...
✓ Created .trunk/config.yaml
✓ Created .trunk/feature-flags.json
✓ Set up branch policies in .github/settings.yaml
✓ Created GitHub Actions workflow

Trunk-based development enabled!
Main branch: main
Max branch age: 2 days
Feature flags file: .trunk/feature-flags.json

$ trunk branch create auth-improvements
Creating short-lived branch 'auth-improvements'...
✓ Branch created from main
✓ Tracking enabled (will warn if branch > 2 days)
✓ Upstream set to origin/main

Tip: Merge back to main within 2 days to stay trunk-based!

$ trunk status
=== Trunk Status ===

Main branch: main (12 commits ahead of last deploy)

Active branches:
  auth-improvements  (you)   0.5 days old  ✓ fresh
  user-profile       (alice)  1.8 days old  ⚠️ getting stale
  legacy-cleanup     (bob)    4.2 days old  ❌ STALE - violates trunk-based

Feature flags:
  new_auth_flow:    enabled for: internal, beta-users
  profile_v2:       disabled (in development)
  dark_mode:        enabled for: 10% rollout

$ trunk flag create new-checkout-flow
Created feature flag 'new_checkout_flow' (disabled by default)

Updated .trunk/feature-flags.json:
{
  "new_checkout_flow": {
    "enabled": false,
    "enabledFor": [],
    "createdAt": "2024-01-15",
    "owner": "you"
  }
}

Usage in code:
  if (isEnabled('new_checkout_flow')) {
    // new code
  }

$ trunk merge
Running pre-merge checks...
✓ Branch age: 0.5 days (ok)
✓ Tests passed
✓ No merge conflicts with main
✓ Code review approved

Squash-merging 3 commits into main...
[main abc1234] feat: improve auth flow (#127)

✓ Branch 'auth-improvements' merged and deleted
✓ Deployment triggered to staging

The Core Question You’re Answering

“Why do high-performing teams commit directly to main, and how do they ship incomplete features without breaking production?”

Before you write any code, sit with this question. The answer is feature flags plus CI/CD. Incomplete code goes to production but is hidden behind flags. This eliminates merge hell and enables true continuous integration.


Concepts You Must Understand First

Stop and research these before coding:

  1. Trunk-Based Development
    • What defines trunk-based development vs. GitFlow?
    • Why are short-lived branches (< 2 days) important?
    • How do you handle work that takes longer than 2 days?
    • Book Reference: “Accelerate” Ch. 4 — Forsgren
  2. Feature Flags
    • What’s the difference between release flags and experiment flags?
    • How do you gradually roll out a feature (canary deployment)?
    • What’s the lifecycle of a feature flag?
    • Book Reference: “Continuous Delivery” Ch. 10 — Humble & Farley
  3. Continuous Integration
    • What’s the difference between CI and continuous delivery?
    • Why must you build on every commit in trunk-based?
    • How do you handle flaky tests?
    • Book Reference: “The DevOps Handbook” Ch. 3 — Kim et al.

Questions to Guide Your Design

Before implementing, think through these:

  1. Branch Policies
    • How will you track branch age?
    • What should happen when a branch exceeds the limit?
    • How do you handle exceptions (releases, hotfixes)?
  2. Feature Flags
    • Where should flags be stored (code, config, external service)?
    • How do you handle flag evaluation at runtime?
    • How do you clean up old flags?
  3. CI Integration
    • What workflows need to run on each commit?
    • How do you handle test failures on main?
    • How do you integrate with existing CI systems?

Thinking Exercise

Simulate a Trunk-Based Sprint

Plan how you’d implement a feature trunk-based:

Feature: Add password strength indicator to signup

Day 1: Create branch, add strength calculation logic (behind flag)
       Commit to main (hidden behind flag, passes tests)

Day 2: Add UI component (behind flag), deploy to staging
       Internal QA tests with flag enabled

Day 3: Enable for beta users, collect feedback

Day 4: Fix issues based on feedback, new commits to main

Day 5: Enable for 25% of users

Week 2: 100% rollout, remove feature flag, delete old code

Questions while planning:

  • Where are the merge conflicts? (Answer: nowhere!)
  • What if you find a bug during rollout?
  • What if the feature needs to be reverted?
  • How long was the branch alive? (1-2 days per micro-feature)

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain trunk-based development and why it reduces integration problems.”
  2. “How would you implement a feature flag system from scratch?”
  3. “What’s the difference between a feature flag and a configuration setting?”
  4. “How do you handle database migrations in trunk-based development?”
  5. “What testing strategies are essential for trunk-based development?”

Hints in Layers

Hint 1: Starting Point Start with the branch age tracker. Note that git log -1 --format=%ct gives the latest commit’s timestamp, not the branch’s start; use git log main..HEAD --format=%ct --reverse | head -1 to approximate when the branch diverged. Store branch metadata in .trunk/.

Hint 2: Feature Flags A simple JSON file works for small teams. For runtime, load the JSON and expose an isEnabled(flagName, context) function.

Hint 3: CI Integration Generate a GitHub Actions workflow that runs on push to main. Use the on: push trigger with proper caching for speed.
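A minimal workflow file your trunk init command might generate (the npm commands are placeholders for whatever the project actually uses to build and test):

```yaml
# .github/workflows/trunk.yml -- minimal CI for trunk-based development
name: trunk-ci
on:
  push:
    branches: [main]    # build every commit that lands on trunk
  pull_request:         # and every short-lived branch before merge
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
```

Running the same job on pull requests and on main is what keeps trunk green: nothing merges unverified, and breakage on main is caught on the very commit that caused it.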

Hint 4: Merge Tooling Your trunk merge should: check age, run tests locally, squash commits, push, and delete the remote branch.


Books That Will Help

Topic Book Chapter
Trunk-based development “Accelerate” by Forsgren Ch. 4
Feature flags “Continuous Delivery” by Humble & Farley Ch. 10
DevOps practices “The DevOps Handbook” by Kim et al. Ch. 3-5

Implementation Hints

Feature flag schema:

{
  "flag_name": {
    "enabled": false,
    "enabledFor": ["user_123", "beta-testers"],
    "percentage": 0,
    "createdAt": "2024-01-15",
    "owner": "alice@company.com"
  }
}
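
A minimal runtime evaluator for this schema might look like the following sketch. The is_enabled helper and the hash-based percentage bucketing are assumptions for illustration, not part of the schema above:

```python
import hashlib
import json

def is_enabled(flags, flag_name, user_id):
    """Evaluate one flag from the JSON schema above for a single user."""
    flag = flags.get(flag_name)
    if flag is None:
        return False                       # unknown flags default to off
    if flag.get("enabled"):
        return True                        # globally on
    if user_id in flag.get("enabledFor", []):
        return True                        # explicit allow-list
    # Deterministic percentage rollout: hash user+flag into a 0-99 bucket
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < flag.get("percentage", 0)

flags = json.loads("""{
  "new_signup": {"enabled": false, "enabledFor": ["user_123"], "percentage": 0}
}""")
print(is_enabled(flags, "new_signup", "user_123"))   # True (allow-list)
print(is_enabled(flags, "new_signup", "user_999"))   # False (0% rollout)
```

Hashing user+flag (rather than random sampling) keeps each user's experience stable across requests during a percentage rollout.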

Branch age tracking:

# Get branch creation time (approximate - first commit on branch not on main)
git log main..HEAD --format=%ct --reverse | head -1
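
That one-liner can be wrapped in a small staleness checker. This is a sketch; the helper names and the 2-day threshold are assumptions:

```python
import subprocess
import time

def first_commit_ts(base="main"):
    """Unix timestamp of the first commit on HEAD that is not on base."""
    out = subprocess.run(
        ["git", "log", f"{base}..HEAD", "--format=%ct", "--reverse"],
        capture_output=True, text=True).stdout.split()
    return int(out[0]) if out else None

def staleness(ts, max_days=2, now=None):
    """Return 'stale' or 'ok' for a branch whose first commit was at ts."""
    if ts is None:
        return "ok"                        # no commits off base yet
    age_days = ((now or time.time()) - ts) / 86400
    return "stale" if age_days > max_days else "ok"
```

Separating the Git call from the age arithmetic keeps the policy testable without a repository.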

Learning milestones:

  1. You can track branch age and warn on staleness → You understand branch discipline
  2. You can manage feature flags → You understand decoupling deploy from release
  3. You can integrate with CI → You understand continuous integration
  4. You can enforce policies automatically → You have a complete trunk-based setup

Project 7: Code Review Bot

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, TypeScript, Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: GitHub API / Static Analysis / Automation
  • Software or Tool: GitHub, Git
  • Main Book: Working Effectively with Legacy Code by Feathers

What you’ll build: An automated code review bot that comments on pull requests with actionable feedback—detecting large diffs, missing tests, style violations, and common anti-patterns.

Why it teaches Git workflows: Professional code review is the gatekeeper of code quality. By building a bot that performs automated reviews, you’ll understand what makes PRs easy or hard to review, why smaller PRs get approved faster, and how to structure changes for maximum reviewability.

Core challenges you’ll face:

  • Accessing the GitHub API → maps to understanding how PR tools work
  • Analyzing diffs programmatically → maps to understanding what reviewers look for
  • Providing actionable feedback → maps to understanding effective code review
  • Handling edge cases → maps to understanding real-world complexity

Key Concepts:

  • GitHub API: GitHub REST/GraphQL API documentation
  • Static analysis: Working Effectively with Legacy Code Ch. 13 — Feathers
  • Code review best practices: The Pragmatic Programmer Ch. 7 — Hunt & Thomas

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: API integration experience, understanding of code analysis


Real World Outcome

You’ll have a bot that automatically reviews PRs:

Example Output:

=== Code Review Bot Report ===
PR #127: "Add user authentication"
Author: alice
Files changed: 8
Lines added: 347
Lines removed: 12

📊 SIZE ANALYSIS
⚠️ This PR is LARGE (347 lines added)
   Consider splitting into smaller PRs for easier review.
   Suggested split:
   - auth/login.ts, auth/logout.ts (auth logic)
   - components/LoginForm.tsx (UI)
   - tests/* (test files)

📝 DIFF ANALYSIS
src/auth/login.ts:
  Line 45: ⚠️ TODO comment found: "TODO: add rate limiting"
           Consider creating an issue instead of a TODO.

  Line 78: ⚠️ Hardcoded string "admin" detected
           Consider using a constant or environment variable.

  Line 112: ⚠️ No error handling for await expression
            Consider wrapping in try/catch.

🧪 TEST COVERAGE
⚠️ New code in src/auth/ but no new tests in tests/auth/
   Added files without corresponding tests:
   - src/auth/login.ts
   - src/auth/session.ts

   Consider adding tests to maintain coverage.

📋 BEST PRACTICES
✓ Conventional commit message format
✓ No merge commits in PR
✓ Description includes context
⚠️ Missing "Testing" section in PR description

💬 AUTO-COMMENT POSTED TO PR:
"Thanks for the PR! I've found a few things to address:
- PR is large (347 lines) - consider splitting
- 2 TODO comments should be converted to issues
- Missing tests for new auth logic
- Please add a 'Testing' section to the description

See full analysis above. Happy to help if you have questions!"

The Core Question You’re Answering

“What makes code review effective, and how can automation enforce best practices without blocking legitimate work?”

Before you write any code, sit with this question. Good code review catches bugs, shares knowledge, and maintains quality—but bad code review is a bottleneck. Automation should handle the mechanical checks so humans can focus on design and logic.


Concepts You Must Understand First

Stop and research these before coding:

  1. GitHub Pull Request API
    • How do you fetch PR details, files, and diff?
    • How do you post comments on specific lines?
    • What’s the difference between issue comments and review comments?
    • Resource: GitHub REST API documentation
  2. Effective Code Review
    • What makes a PR easy to review?
    • What’s the ideal PR size?
    • What should humans review vs. what should be automated?
    • Book Reference: “The Pragmatic Programmer” Ch. 7 — Hunt & Thomas
  3. Static Analysis
    • What patterns indicate potential bugs?
    • How do you detect missing test coverage?
    • What anti-patterns are machine-detectable?
    • Book Reference: “Working Effectively with Legacy Code” Ch. 13 — Feathers

Questions to Guide Your Design

Before implementing, think through these:

  1. Trigger Mechanism
    • How will the bot be triggered (webhook, polling, manual)?
    • How do you authenticate with GitHub?
    • How do you handle rate limiting?
  2. Analysis Types
    • What checks are universal vs. project-specific?
    • How do you configure checks per repository?
    • How do you handle false positives?
  3. Feedback Delivery
    • Should you comment on individual lines or summarize?
    • How do you avoid being annoying (comment spam)?
    • How do you handle re-reviews after changes?

Thinking Exercise

Analyze a Real PR

Find a PR on a popular open source project and analyze it:

  1. Open a PR with 20+ files changed
  2. For each file, note what you’d want automated checking for
  3. Identify patterns that humans shouldn’t have to catch manually

Questions while analyzing:

  • How long did this PR take to get reviewed?
  • Did reviewers comment on things a bot could catch?
  • Would you have approved this PR as-is?
  • What would make this PR easier to review?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What makes a pull request easy or hard to review?”
  2. “How would you design a system to automatically assign code reviewers?”
  3. “What’s the tradeoff between thorough automated checks and developer velocity?”
  4. “How would you handle a bot that generates too many false positives?”
  5. “What code review aspects should remain human-only?”

Hints in Layers

Hint 1: Starting Point Use PyGithub or the REST API directly. Start with fetching PR details and printing file names.

Hint 2: Diff Analysis The GitHub API returns diff hunks. Parse the @@ line numbers to know where changes occurred.
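
Parsing those @@ hunk headers can be sketched as follows; changed_lines is a hypothetical helper that walks a unified-diff patch string:

```python
import re

# Unified diff hunk header: @@ -old_start,old_count +new_start,new_count @@
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def changed_lines(patch):
    """Yield (new_file_line_number, text) for each added line in a patch."""
    new_line = 0
    for line in patch.splitlines():
        m = HUNK_RE.match(line)
        if m:
            new_line = int(m.group(3))      # start line on the new side
        elif line.startswith("+") and not line.startswith("+++"):
            yield new_line, line[1:]
            new_line += 1
        elif not line.startswith("-"):
            new_line += 1                   # context lines advance both sides

patch = "@@ -1,2 +1,3 @@\n a\n+b\n c"
print(list(changed_lines(patch)))           # [(2, 'b')]
```

Knowing the new-file line number is what lets the bot attach a review comment to the exact line it is complaining about.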

Hint 3: Pattern Detection Regular expressions work for simple patterns (TODO, FIXME, hardcoded strings). For language-specific analysis, consider AST parsing.

Hint 4: Commenting Use the “pull request review” API to batch comments. Individual line comments go in comments, overall feedback goes in body.


Books That Will Help

Topic Book Chapter
Code smells “Working Effectively with Legacy Code” by Feathers Ch. 13-16
Review best practices “The Pragmatic Programmer” by Hunt & Thomas Ch. 7
API design “Build APIs You Won’t Hate” by Sturgeon Ch. 4-6

Implementation Hints

GitHub API patterns:

# Using PyGithub; token and pr_number come from your bot's config
from github import Github

repo = Github(token).get_repo("owner/repo")

# Fetch PR
pr = repo.get_pull(pr_number)
files = pr.get_files()

for file in files:
    # file.filename, file.patch, file.additions, file.deletions
    analyze_diff(file.patch)

# Post review (line comments are batched with the summary body)
pr.create_review(
    body="Overall feedback here",
    event="COMMENT",  # or "APPROVE" or "REQUEST_CHANGES"
    comments=[
        {"path": "src/file.ts", "line": 45, "body": "Consider..."}
    ]
)

Common checks to implement:

  • PR size (files, lines changed)
  • TODO/FIXME comments
  • Hardcoded credentials/secrets
  • Missing test files
  • Console.log / print statements
  • Long functions
  • Deep nesting
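
Several of these are plain regex checks. A sketch follows; the patterns and messages are illustrative, not a complete rule set:

```python
import re

CHECKS = [
    (re.compile(r"\b(TODO|FIXME)\b"), "TODO/FIXME found - consider filing an issue"),
    (re.compile(r"(?i)(password|secret|api_key)\s*=\s*['\"]"), "possible hardcoded secret"),
    (re.compile(r"console\.log\(|\bprint\("), "leftover debug output"),
]

def review_line(line):
    """Return every check message that fires for one line of a diff."""
    return [msg for pattern, msg in CHECKS if pattern.search(line)]

# Both the TODO check and the secret check fire here:
print(review_line('api_key = "abc123"  # TODO remove'))
```

Structural checks (long functions, deep nesting) need real parsing; keep regexes for the lexical cases only.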

Learning milestones:

  1. You can fetch and parse PR data → You understand the GitHub API
  2. You can detect common issues in diffs → You understand static analysis
  3. You can post meaningful comments → You understand review communication
  4. Your bot helps without annoying → You understand the balance

Project 8: Monorepo Task Runner

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, TypeScript, Python
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Build Systems / Graph Algorithms / Caching
  • Software or Tool: Git
  • Main Book: Software Engineering at Google by Winters et al.

What you’ll build: A monorepo task runner (like a mini Turborepo or Nx) that detects which packages changed, runs only affected tests, and caches results to avoid redundant work.

Why it teaches Git workflows: Monorepos are how Google, Microsoft, and many large companies organize code. The challenges—scale, affected detection, incremental builds—teach you how Git can be used as more than version control; it becomes the source of truth for what changed.

Core challenges you’ll face:

  • Detecting affected packages → maps to understanding dependency graphs + git diff
  • Implementing task caching → maps to understanding content-addressable storage
  • Parallel task execution → maps to understanding topological order in DAGs
  • Handling workspace dependencies → maps to understanding package relationships

Key Concepts:

  • Monorepo architecture: Software Engineering at Google Ch. 16-18 — Winters et al.
  • Build caching: Bazel documentation on remote caching
  • Graph algorithms: Grokking Algorithms Ch. 6 — Bhargava

Difficulty: Expert Time estimate: 1 month+ Prerequisites: Projects 1-7 completed, graph algorithms, understanding of build systems


Real World Outcome

You’ll have a task runner that makes monorepos manageable:

Example Output:

$ mono status
=== Monorepo Status ===

Packages (5):
  packages/core       - library, no changes
  packages/utils      - library, 2 files changed
  packages/api        - app, depends on core, utils
  packages/web        - app, depends on core, utils
  packages/cli        - app, depends on core

Dependency graph:
  api ──→ core
    └───→ utils
  web ──→ core
    └───→ utils
  cli ──→ core

$ mono affected --base=main
Analyzing changes from main...

Changed files:
  packages/utils/src/string.ts (+12 -3)
  packages/utils/src/date.ts (+5 -2)

Affected packages (3):
  packages/utils      - directly changed
  packages/api        - depends on utils
  packages/web        - depends on utils

Unaffected packages (2):
  packages/core       - no dependency on changed files
  packages/cli        - no dependency on changed files

$ mono test --affected
Running tests for affected packages...

[1/3] Testing utils...
  Source files changed, cache invalidated (was abc123)
  Running 12 tests...
  ✓ 12 passed (2.3s)
  Cache stored: def456

[2/3] Testing api...
  Dependency utils changed, cache invalidated
  Running 47 tests...
  ✓ 47 passed (8.1s)
  Cache stored: ghi789

[3/3] Testing web...
  Dependency utils changed, cache invalidated
  Running 83 tests...
  ✓ 83 passed (12.4s)
  Cache stored: jkl012

Summary:
  3 packages tested
  142 tests passed
  Total time: 22.8s (without affected detection: ~45s)
  Cache hit rate: 0% (invalidated by changes)

$ mono test --affected  # Run again, nothing changed
All 3 affected packages have valid cache entries.
✓ Nothing to run (22.8s saved)

The Core Question You’re Answering

“How do you build and test only what changed in a codebase with hundreds of packages?”

Before you write any code, sit with this question. The answer combines Git diff to know what changed, a dependency graph to know what’s affected, and content-addressable caching to skip redundant work.


Concepts You Must Understand First

Stop and research these before coding:

  1. Package Dependency Graphs
    • How do you model dependencies between packages?
    • What’s the difference between dependencies and devDependencies?
    • How do you detect circular dependencies?
    • Book Reference: “Grokking Algorithms” Ch. 6 — Bhargava
  2. Affected Detection
    • How do you use git diff to find changed files?
    • How do you map files to packages?
    • How do you propagate “affected” through the dependency graph?
    • Book Reference: “Software Engineering at Google” Ch. 17 — Winters et al.
  3. Task Caching
    • What inputs determine a task’s cache key?
    • How do you store and retrieve cache entries?
    • When is it safe to reuse a cached result?
    • Resource: Turborepo documentation on caching

Questions to Guide Your Design

Before implementing, think through these:

  1. Package Discovery
    • How will you find packages in the repo (package.json, Cargo.toml, etc.)?
    • How will you extract dependencies?
    • How will you handle different package managers?
  2. Cache Key Calculation
    • What inputs affect a task’s output (source files, dependencies, config)?
    • How do you hash all these inputs efficiently?
    • Should the cache be local, remote, or both?
  3. Task Orchestration
    • How do you respect dependency order (topological sort)?
    • How do you parallelize independent tasks?
    • How do you handle task failures?

Thinking Exercise

Trace a Change Through Dependencies

Map out a monorepo change manually:

packages/
  shared-utils/    (dependency of everything)
  auth-service/    (depends on: shared-utils)
  user-api/        (depends on: shared-utils, auth-service)
  web-app/         (depends on: shared-utils, user-api)
  cli-tool/        (depends on: shared-utils)

Change: Edit packages/shared-utils/src/format.ts

Questions while tracing:

  • Which packages need to be rebuilt?
  • In what order should they be rebuilt?
  • If you had cached builds from yesterday, which caches are now invalid?
  • How many packages could you build in parallel?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How would you design a build system for a monorepo with 100 packages?”
  2. “Explain how you would calculate a cache key for a build task.”
  3. “What’s the time complexity of detecting affected packages?”
  4. “How do you handle diamond dependencies in a monorepo?”
  5. “What are the tradeoffs between monorepos and polyrepos?”

Hints in Layers

Hint 1: Starting Point Start with package discovery. Glob for package.json files, parse them, extract dependencies.

Hint 2: Dependency Graph Build an adjacency list where graph[pkg] = list of packages that depend on pkg. For affected detection, traverse from changed packages.

Hint 3: Cache Key Compute: hash(source_files_hash + dependencies_cache_keys + config_hash). If any input changes, the cache is invalid.

Hint 4: Execution Order Use Kahn’s algorithm for topological sort. Build a queue of packages with no pending dependencies; process and add newly unblocked packages.


Books That Will Help

Topic Book Chapter
Monorepo at scale “Software Engineering at Google” by Winters et al. Ch. 16-18
Graph algorithms “Grokking Algorithms” by Bhargava Ch. 6
Build system design “The Bazel Book” (online) Ch. 1-3

Implementation Hints

Cache key calculation:

import hashlib
from functools import lru_cache

@lru_cache(maxsize=None)          # memoize: shared deps are hashed once
def compute_cache_key(package):
    hasher = hashlib.sha256()

    # Hash source files (file_hash/config_hash are assumed to return bytes)
    for file in get_source_files(package):
        hasher.update(file_hash(file))

    # Hash dependencies' cache keys (transitive)
    for dep in get_dependencies(package):
        hasher.update(compute_cache_key(dep).encode())

    # Hash config
    hasher.update(config_hash(package))

    return hasher.hexdigest()

Affected detection:

from collections import deque

def get_affected(changed_files):
    # changed_files comes from: git diff --name-only <base_ref>...HEAD

    # Map files to packages
    changed_packages = set()
    for file in changed_files:
        pkg = get_package_for_file(file)
        if pkg:
            changed_packages.add(pkg)

    # Find all dependents (BFS over the reverse dependency graph)
    affected = set(changed_packages)
    queue = deque(changed_packages)
    while queue:
        pkg = queue.popleft()
        for dependent in get_dependents(pkg):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)

    return affected
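
The execution order from Hint 4 (Kahn's topological sort) can be sketched like this; deps is an assumed package→dependencies mapping:

```python
from collections import deque

def topo_order(deps):
    """deps[pkg] = set of packages pkg depends on; returns a valid build order."""
    indegree = {pkg: len(d) for pkg, d in deps.items()}
    dependents = {pkg: [] for pkg in deps}
    for pkg, d in deps.items():
        for dep in d:
            dependents[dep].append(pkg)

    queue = deque(pkg for pkg, n in indegree.items() if n == 0)
    order = []
    while queue:
        pkg = queue.popleft()              # all of pkg's deps are already built
        order.append(pkg)
        for dependent in dependents[pkg]:
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                queue.append(dependent)

    if len(order) != len(deps):
        raise ValueError("circular dependency detected")
    return order

# The example monorepo from `mono status`:
deps = {"core": set(), "utils": set(), "cli": {"core"},
        "api": {"core", "utils"}, "web": {"core", "utils"}}
print(topo_order(deps))
```

Everything sitting in the queue at the same time has no pending dependencies, so those packages are exactly what you can run in parallel.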

Learning milestones:

  1. You can discover packages and build a dependency graph → You understand monorepo structure
  2. You can detect affected packages from git diff → You understand change propagation
  3. You can compute and use cache keys → You understand incremental builds
  4. You can run tasks in topological order → You have a working task runner

Project 9: Git Bisect Automator

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Bash, Go, Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Binary Search / Git Bisect / Debugging
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A tool that wraps git bisect to automatically find the commit that introduced a bug by running a test script, with support for skip detection, performance optimizations, and detailed reporting.

Why it teaches Git workflows: Bisect is Git’s killer debugging feature—it uses binary search to find the exact commit that broke something. By building a wrapper, you’ll understand how bisect works internally and how to leverage it effectively in real debugging scenarios.

Core challenges you’ll face:

  • Driving git bisect programmatically → maps to understanding bisect’s state machine
  • Handling untestable commits → maps to understanding git bisect skip
  • Detecting flaky tests → maps to understanding real-world complexity
  • Optimizing search → maps to understanding binary search on DAGs

Key Concepts:

  • Git bisect: Pro Git Ch. 7.10 — Scott Chacon
  • Binary search: Algorithms Ch. 1 — Sedgewick & Wayne
  • Test automation: Continuous Delivery Ch. 8 — Humble & Farley

Difficulty: Intermediate Time estimate: 1 week Prerequisites: Basic Git, understanding of binary search


Real World Outcome

You’ll have a tool that makes bisect easy and informative:

Example Output:

$ auto-bisect --good v1.0.0 --bad HEAD --test "npm test"

=== Auto Bisect ===
Good: v1.0.0 (abc123)
Bad:  HEAD (xyz789)

Calculating search space...
  Commits between good and bad: 127
  Expected bisect steps: ~7 (log₂(127) = 6.99)

Starting automated bisect...

Step 1/7: Testing commit def456 "Add user profile feature"
  Running: npm test
  Result: GOOD (tests pass)
  Search space: 127 → 63 commits remaining

Step 2/7: Testing commit ghi789 "Refactor authentication"
  Running: npm test
  Result: BAD (tests fail)
  Search space: 63 → 31 commits remaining

Step 3/7: Testing commit jkl012 "Update dependencies"
  Running: npm test
  Result: SKIP (build failed, can't test)
  Skipping this commit, trying adjacent...

  Testing commit jkl011 "Fix linting errors"
  Result: GOOD (tests pass)
  Search space: 31 → 15 commits remaining

... (steps 4-7)

Step 7/7: Testing commit mno345 "Fix login redirect"
  Running: npm test
  Result: BAD (tests fail)

=== BISECT COMPLETE ===

First bad commit: mno345
Author: Alice <alice@example.com>
Date:   2024-01-12 14:32:00

    Fix login redirect

    Changed the redirect URL after successful login
    to use relative paths instead of absolute.

Changed files:
  src/auth/login.ts (+5 -3)
  src/routes/index.ts (+2 -1)

This commit likely introduced the bug!

Suggestion: Check the changes to src/auth/login.ts lines 45-52

$ auto-bisect --log
Previous bisect sessions:
  2024-01-15: Found mno345 (7 steps, 2m 30s)
  2024-01-10: Found abc123 (5 steps, 1m 45s)

The Core Question You’re Answering

“How does binary search apply to debugging, and how does Git leverage the commit graph for bisect?”

Before you write any code, sit with this question. Bisect works because Git history is ordered (parent relationships). Given a known-good and known-bad commit, you can binary search through the DAG to find where things went wrong.


Concepts You Must Understand First

Stop and research these before coding:

  1. Binary Search on DAGs
    • How does bisect work when history isn’t linear?
    • How does Git choose the midpoint in a merge-heavy history?
    • What’s the worst-case complexity?
    • Book Reference: “Pro Git” Ch. 7.10 — Chacon
  2. Git Bisect State
    • Where does Git store bisect state?
    • What are the bisect commands (start, good, bad, skip, reset)?
    • How do you automate bisect with git bisect run?
    • Book Reference: “Pro Git” Ch. 7.10 — Chacon
  3. Test Reliability
    • What makes a test suitable for bisecting?
    • How do you handle commits that can’t be tested (build failures)?
    • How do you detect and handle flaky tests?
    • Book Reference: “Continuous Delivery” Ch. 8 — Humble & Farley

Questions to Guide Your Design

Before implementing, think through these:

  1. Test Script Interface
    • What exit codes should the test script use (0 = good, 1-127 except 125 = bad, 125 = skip)?
    • How do you handle timeouts?
    • How do you capture and display test output?
  2. Bisect Optimization
    • How can you speed up bisect (parallel builds, caching)?
    • How do you minimize checkout operations?
    • Can you pre-compute which commits are skippable?
  3. Reporting
    • What information is most useful when bisect completes?
    • How do you present the journey (steps taken)?
    • How do you suggest next debugging steps?

Thinking Exercise

Walk Through Bisect Manually

Simulate bisect on paper:

Commit history: A ← B ← C ← D ← E ← F ← G ← H
                good  ?   ?   ?   ?   ?   ?  bad

Start: good=A, bad=H (8 commits)

Questions while walking through:

  • Step 1: Which commit does Git test first (midpoint)?
  • If midpoint is BAD, what’s the new search range?
  • If midpoint is GOOD, what’s the new search range?
  • How many steps maximum to find the first bad commit?
  • What if commit D can’t be built (skip)?
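
For the step-count question: on linear history the worst case is logarithmic in the number of suspect commits. A small helper (not part of Git) makes this concrete:

```python
import math

def max_bisect_steps(candidates):
    """Worst-case bisect steps for a linear history with `candidates` suspects."""
    return math.ceil(math.log2(candidates)) if candidates > 1 else 0

print(max_bisect_steps(7))    # the 7 suspects B..H above -> 3 steps
print(max_bisect_steps(127))  # matches the "~7 steps" in the earlier example run
```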

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain how git bisect works internally.”
  2. “What’s the time complexity of git bisect?”
  3. “How would you handle a situation where bisect identifies a merge commit as bad?”
  4. “What makes a good test script for automated bisect?”
  5. “How would you bisect a performance regression (not a pass/fail test)?”

Hints in Layers

Hint 1: Starting Point Use git bisect run ./test-script.sh. Exit code 0 = good, 1-127 (except 125) = bad, 125 = skip; anything above 127 aborts the bisect.

Hint 2: Parsing Output Capture bisect output to track progress. Look for “Bisecting:” lines to know which commit is being tested.

Hint 3: Enhanced Reporting After bisect completes, use git show --stat <bad-commit> to show what files changed.

Hint 4: Flaky Detection Run the test multiple times at a commit. If results are inconsistent, mark as flaky and skip.
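
That retry idea can be sketched as follows; classify is a hypothetical helper whose return values follow the git bisect run exit-code convention:

```python
import subprocess

def classify(test_cmd, runs=3):
    """Run the test several times; inconsistent results mean flaky -> skip."""
    outcomes = {subprocess.run(test_cmd, shell=True).returncode == 0
                for _ in range(runs)}
    if len(outcomes) > 1:
        return "skip"                 # saw both pass and fail: flaky commit
    return "good" if outcomes.pop() else "bad"
```

A deterministic suite yields the same verdict every run; anything else is marked skip so bisect routes around it instead of following a false signal.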


Books That Will Help

Topic Book Chapter
Git bisect “Pro Git” by Chacon Ch. 7.10
Binary search “Algorithms” by Sedgewick Ch. 1
Test reliability “Continuous Delivery” by Humble & Farley Ch. 8

Implementation Hints

Basic automated bisect wrapper:

import subprocess

def auto_bisect(good, bad, test_cmd):
    # Initialize bisect; git checks out the first midpoint commit
    subprocess.run(["git", "bisect", "start"], check=True)
    subprocess.run(["git", "bisect", "bad", bad], check=True)
    subprocess.run(["git", "bisect", "good", good], check=True)

    try:
        while True:
            # Log which commit is being tested
            current = subprocess.run(
                ["git", "rev-parse", "--short", "HEAD"],
                capture_output=True, text=True).stdout.strip()
            print(f"Testing {current}...")

            # Run the test; mirror git bisect run's exit-code convention
            code = subprocess.run(test_cmd, shell=True).returncode
            if code == 0:
                mark = "good"
            elif code == 125:
                mark = "skip"
            elif code <= 127:
                mark = "bad"
            else:
                raise RuntimeError(f"test aborted with exit code {code}")

            # Tell bisect the result; it checks out the next midpoint
            result = subprocess.run(
                ["git", "bisect", mark],
                capture_output=True, text=True)

            # Check if done
            if "is the first bad commit" in result.stdout:
                print(result.stdout)
                break
    finally:
        # Always restore the original HEAD
        subprocess.run(["git", "bisect", "reset"])

Learning milestones:

  1. You can run basic git bisect manually → You understand bisect concepts
  2. You can automate bisect with a test script → You understand git bisect run
  3. You can handle skip cases gracefully → You understand real-world complexity
  4. You can generate useful reports → You have a debugging power tool

Project 10: Stacked PRs Manager

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Python, Rust, TypeScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Git Rebase / GitHub API / Workflow Automation
  • Software or Tool: Git, GitHub
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A tool for managing stacked (dependent) pull requests—where PR 2 depends on PR 1, PR 3 depends on PR 2, etc. The tool handles rebasing the stack when upstream changes, updating PR descriptions with dependency info, and orchestrating merges in order.

Why it teaches Git workflows: Stacked PRs are essential for large features that need to be reviewed incrementally. Managing them manually is error-prone; this project teaches you the rebase mechanics and the orchestration needed to keep a chain of branches synchronized.

Core challenges you’ll face:

  • Tracking stack relationships → maps to understanding branch dependencies
  • Cascading rebases → maps to understanding how rebasing affects dependents
  • Updating PR descriptions → maps to understanding GitHub API and automation
  • Orchestrating merges → maps to understanding ordered merge operations

Key Concepts:

  • Rebase mechanics: Pro Git Ch. 3.6, 7.6 — Scott Chacon
  • Dependent branches: Graphite/Stacked PRs concepts
  • GitHub API: GitHub REST API documentation

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Projects 1-7 completed, strong rebase understanding


Real World Outcome

You’ll have a CLI for managing PR stacks:

Example Output:

$ stack create auth-refactor
Created stack 'auth-refactor' based on main

$ stack branch add-user-model
Created branch 'auth-refactor/add-user-model' in stack
# ... make commits ...

$ stack branch add-login-endpoint
Created branch 'auth-refactor/add-login-endpoint' on top of 'add-user-model'
# ... make commits ...

$ stack branch add-login-ui
Created branch 'auth-refactor/add-login-ui' on top of 'add-login-endpoint'
# ... make commits ...

$ stack status
Stack: auth-refactor (based on main)

  main
    └── add-user-model      (3 commits, PR #127)
        └── add-login-endpoint  (5 commits, PR #128)
            └── add-login-ui    (4 commits, PR #129)  ← HEAD

All branches up to date with their bases.

$ stack push
Pushing stack to origin...
✓ Pushed auth-refactor/add-user-model
✓ Pushed auth-refactor/add-login-endpoint
✓ Pushed auth-refactor/add-login-ui

Creating/updating PRs...
✓ PR #127: Add user model
  Base: main
  Description updated with stack info

✓ PR #128: Add login endpoint
  Base: auth-refactor/add-user-model
  Description updated with stack info:
  "⬆️ Depends on: #127 (Add user model)
   ⬇️ Blocks: #129 (Add login UI)"

✓ PR #129: Add login UI
  Base: auth-refactor/add-login-endpoint
  Description updated with stack info:
  "⬆️ Depends on: #128 (Add login endpoint)"

$ # Someone merges PR #127...

$ stack sync
Syncing stack with origin...

Fetching updates...
PR #127 was merged to main!

Rebasing stack...
  Rebasing add-login-endpoint onto main...
  ✓ Rebased successfully (was based on add-user-model)

  Rebasing add-login-ui onto add-login-endpoint...
  ✓ Rebased successfully

Updating PRs...
✓ PR #128: Base changed to main (was add-user-model)
✓ PR #129: No changes needed

Stack synced! Ready for more merges.

The Core Question You’re Answering

“How do you break large features into reviewable chunks while keeping them synchronized?”

Before you write any code, sit with this question. Large PRs are hard to review, but splitting a feature into dependent PRs creates a maintenance burden. The solution is tooling that automates the cascade.


Concepts You Must Understand First

Stop and research these before coding:

  1. Branch Dependencies
    • How do you model “branch B is based on branch A”?
    • What happens to B when A gets new commits?
    • What happens when A is rebased or merged?
    • Book Reference: “Pro Git” Ch. 3.6 — Chacon
  2. Cascading Rebase
    • How do you rebase a chain of branches?
    • What order should you rebase in?
    • How do you handle conflicts in the middle of a chain?
    • Book Reference: “Pro Git” Ch. 7.6 — Chacon
  3. PR Base Branches
    • How do you set a PR’s base branch to another branch (not main)?
    • What happens to a PR when its base branch is merged?
    • How do you update a PR’s base branch via API?
    • Resource: GitHub REST API documentation

Questions to Guide Your Design

Before implementing, think through these:

  1. Stack Representation
    • How will you store the stack metadata (which branch is on top of which)?
    • Should this be in .git, a config file, or derived from branch names?
    • How do you handle branches that are part of multiple stacks?
  2. Sync Algorithm
    • When main changes, how do you detect which stacks need updating?
    • When a PR is merged, how do you update the stack?
    • What if a rebase has conflicts?
  3. PR Management
    • How do you set up a PR with a non-main base branch?
    • How do you generate the “depends on/blocks” description?
    • How do you keep PR descriptions in sync with stack state?

Thinking Exercise

Trace a Stack Update

Simulate a stack update on paper:

Initial state:
  main ← A ← B (add-user-model)
              └← C ← D (add-login-endpoint)
                      └← E ← F (add-login-ui)

Action: Main gets new commit X
  main ← X

Desired state:
  main ← X ← A' ← B' (add-user-model)
                   └← C' ← D' (add-login-endpoint)
                            └← E' ← F' (add-login-ui)

Questions while tracing:

  • In what order do you rebase the branches?
  • What commands do you run for each rebase?
  • Why are A’, B’, C’, etc. different commits (different SHAs)?
  • What if commit C conflicts with commit X?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How would you design a system for managing dependent pull requests?”
  2. “What happens when a branch in the middle of a stack gets merged?”
  3. “How do you handle rebase conflicts in a stack of branches?”
  4. “What are the tradeoffs between stacked PRs and a single large PR?”
  5. “How would you implement ‘stacking’ without special tooling?”

Hints in Layers

Hint 1: Starting Point Store stack metadata in .git/stack-info/. For each stack, record the ordered list of branches.

Hint 2: Rebase Order Always rebase from bottom of stack to top. If you rebase the top first, you’ll have to redo it when you rebase its base.

Hint 3: Detecting Merges Check if a branch’s base exists on the remote. If the base is merged into main, main becomes the new base.

Hint 4: PR Updates Use gh api or PyGithub to update PR base branches. When base is merged, GitHub automatically retargets to main.
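
The retargeting decision itself is pure bookkeeping and can be sketched without any API calls; retarget_plan is a hypothetical helper:

```python
def retarget_plan(branches, merged, trunk="main"):
    """Given stack branches in order (bottom first) and the set already merged,
    return {branch: base_it_should_target}."""
    plan = {}
    base = trunk
    for name in branches:
        if name in merged:
            continue                  # merged branches drop out of the stack
        plan[name] = base
        base = name                   # the next branch stacks on this one
    return plan

# After PR #127 (add-user-model) merges, as in the `stack sync` example:
print(retarget_plan(
    ["add-user-model", "add-login-endpoint", "add-login-ui"],
    merged={"add-user-model"}))
# {'add-login-endpoint': 'main', 'add-login-ui': 'add-login-endpoint'}
```

Computing the plan first, then applying it via the API, makes the sync step idempotent and easy to dry-run.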


Books That Will Help

Topic Book Chapter
Rebase mechanics “Pro Git” by Chacon Ch. 3.6, 7.6
Branch workflows “Pro Git” by Chacon Ch. 5.1-5.3
GitHub CLI gh documentation

Implementation Hints

Stack metadata format:

# .git/stack-info/auth-refactor.yaml
name: auth-refactor
base: main
branches:
  - name: add-user-model
    pr: 127
  - name: add-login-endpoint
    pr: 128
  - name: add-login-ui
    pr: 129

Cascade rebase algorithm:

import subprocess

def rev_parse(ref):
    out = subprocess.run(["git", "rev-parse", ref],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def sync_stack(stack):
    new_base = stack.base  # "main" — where the next branch gets replayed
    old_base = stack.base  # commits reachable from here are excluded

    for branch in stack.branches:
        old_tip = rev_parse(branch.name)  # remember the pre-rebase tip

        # Replay old_base..branch onto new_base
        subprocess.run(["git", "rebase", "--onto", new_base,
                        old_base, branch.name], check=True)

        old_base = old_tip       # the next branch's own commits start after this
        new_base = branch.name   # and get replayed onto this branch's new tip

Learning milestones:

  1. You can track stack relationships → You understand branch dependencies
  2. You can cascade rebases correctly → You understand rebase mechanics deeply
  3. You can sync PRs with stack state → You understand GitHub API integration
  4. Your tool handles the merge case → You have a production-ready stacked PR tool

Project 11: Conventional Commits Enforcer

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, Python, TypeScript
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Parsing / Semantic Versioning / Git Hooks
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A commit message linter that enforces the Conventional Commits specification, generates changelogs automatically, and determines semantic version bumps based on commit types.

Why it teaches Git workflows: Conventional commits are the foundation of automated releases. By building the parser and enforcer yourself, you’ll understand how tools like semantic-release, commitlint, and changelog generators work—and why consistent commit messages enable powerful automation.

Core challenges you’ll face:

  • Parsing commit message format → maps to understanding the Conventional Commits spec
  • Determining version bumps → maps to understanding semantic versioning
  • Generating changelogs → maps to understanding release automation
  • Integrating with hooks → maps to understanding enforcement mechanisms

Key Concepts:

  • Conventional Commits: conventionalcommits.org specification
  • Semantic Versioning: semver.org
  • Changelog generation: keepachangelog.com

Difficulty: Intermediate Time estimate: 1 week Prerequisites: Understanding of regex, basic parsing concepts


Real World Outcome

You’ll have a complete commit linting and changelog system:

Example Output:

$ commit-lint "Add new feature"
❌ Invalid commit message

Error: Missing type prefix
Expected format: <type>[optional scope]: <description>

Valid types: feat, fix, docs, style, refactor, test, chore
Example: "feat: add new feature"

$ commit-lint "feat: add user authentication"
✓ Valid commit message

Type: feat (minor version bump)
Scope: none
Description: add user authentication
Breaking: no

$ commit-lint "fix(auth)!: handle token expiration correctly"
✓ Valid commit message

Type: fix (patch version bump)
Scope: auth
Description: handle token expiration correctly
Breaking: YES (major version bump)

$ changelog generate --from v1.2.0 --to HEAD

# Changelog

## [1.3.0] - 2024-01-15

### Features
- **auth:** add two-factor authentication (#127)
- add password strength indicator (#125)

### Bug Fixes
- **api:** fix rate limiting calculation (#130)
- handle edge case in date parsing (#128)

### Documentation
- update API reference for auth endpoints (#131)

### BREAKING CHANGES
- **auth:** token format changed, clients must update (#126)

---

$ semver suggest --from v1.2.0
Analyzing commits since v1.2.0...

Commits analyzed: 12
- feat: 4 (minor bump)
- fix: 5 (patch bump)
- docs: 2 (no bump)
- feat!: 1 (BREAKING - major bump)

Suggested next version: 2.0.0 (major bump due to breaking change)

Breaking commit:
  abc123 feat(auth)!: token format changed, clients must update

The Core Question You’re Answering

“How do you turn commit messages into automated releases and changelogs?”

Before you write any code, sit with this question. The answer is conventions. When every commit follows a pattern, machines can parse them to determine what changed, how to version it, and what to tell users.


Concepts You Must Understand First

Stop and research these before coding:

  1. Conventional Commits Specification
    • What are the required elements (type, description)?
    • What are the optional elements (scope, body, footer)?
    • How do you indicate breaking changes?
    • Resource: conventionalcommits.org
  2. Semantic Versioning
    • What do MAJOR.MINOR.PATCH mean?
    • When do you bump each number?
    • What’s a pre-release version?
    • Resource: semver.org
  3. Changelog Best Practices
    • What sections should a changelog have?
    • How do you group changes by type?
    • What makes a changelog human-readable?
    • Resource: keepachangelog.com

Questions to Guide Your Design

Before implementing, think through these:

  1. Parsing
    • How do you handle multi-line commit messages?
    • How do you extract the body vs. footers?
    • What regex pattern matches the Conventional Commits format?
  2. Version Determination
    • What if there are multiple breaking changes?
    • How do you handle version ranges (e.g., v1.0.0 to v2.0.0)?
    • What about pre-release versions?
  3. Integration
    • How do you make this work as a commit-msg hook?
    • How do you handle commits that bypass hooks?
    • Should CI also validate commit messages?
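
On the hook question: Git invokes the commit-msg hook with a single argument, the path to a file holding the message. A minimal Python hook sketch (the regex mirrors the spec's subject shape; the type list is illustrative):

```python
import re

SUBJECT_RE = re.compile(r"^(\w+)(\([^)]+\))?!?: .+")
TYPES = {"feat", "fix", "docs", "style", "refactor", "test", "chore"}

def validate(message: str) -> bool:
    """Check the first line against the Conventional Commits subject shape."""
    first = message.splitlines()[0] if message.strip() else ""
    m = SUBJECT_RE.match(first)
    return bool(m) and m.group(1) in TYPES

def hook_main(argv: list[str]) -> int:
    # Installed as .git/hooks/commit-msg; Git passes the message file path.
    with open(argv[1]) as f:
        return 0 if validate(f.read()) else 1

# As a hook entry point: sys.exit(hook_main(sys.argv))
```

Commits made with `--no-verify` skip this entirely, which is why the CI question matters.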

Thinking Exercise

Parse Example Commits

Analyze these commit messages:

1. feat(auth): add OAuth2 support
2. fix: resolve memory leak in worker pool
3. feat!: redesign API response format
4. docs(readme): update installation instructions
5. chore(deps): bump lodash from 4.17.20 to 4.17.21
6. refactor(core): extract validation logic

   This change moves validation into a separate module
   for better testability.

   BREAKING CHANGE: ValidationError now includes error codes
   Reviewed-by: Alice
   Refs: #123

Questions while parsing:

  • What’s the type, scope, and description for each?
  • Which commits bump which version component?
  • How do you extract the body from commit 6?
  • How do you detect the BREAKING CHANGE footer?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why are conventional commits useful for automated releases?”
  2. “How would you parse a commit message to extract the type and scope?”
  3. “Explain semantic versioning and when you’d bump each number.”
  4. “How would you handle enforcing commit conventions in a large team?”
  5. “What’s the difference between a breaking change in the type (!) and in the footer?”

Hints in Layers

Hint 1: Starting Point The basic regex: ^(\w+)(\(.+\))?!?: (.+)$. This captures type, optional scope, and description.
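
That regex, exercised in Python. One tweak to the hint's pattern: capturing the `!` as its own group makes the breaking flag fall out of the match (group 2 keeps the parentheses, so strip them) — a sketch to extend, not a full spec parser:

```python
import re

# Hint's regex, with `!` captured so we can detect breaking changes
PATTERN = re.compile(r"^(\w+)(\(.+\))?(!)?: (.+)$")

def parse_subject(subject: str):
    m = PATTERN.match(subject)
    if not m:
        return None
    type_, scope, bang, description = m.groups()
    return {
        "type": type_,
        "scope": scope.strip("()") if scope else None,
        "breaking": bang is not None,
        "description": description,
    }
```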

Hint 2: Body and Footers Split on double newlines. First paragraph is the subject. Remaining paragraphs are body unless they match key: value or BREAKING CHANGE:.

Hint 3: Version Bump Logic BREAKING CHANGE or ! → major. feat → minor. fix → patch. Everything else → no bump.
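
Hint 3 as code: fold each commit's bump into the strongest one seen. This assumes each commit has already been parsed into a dict with `type` and `breaking` keys (an assumed shape, not a fixed API):

```python
# Precedence: major > minor > patch > none
LEVELS = {None: 0, "patch": 1, "minor": 2, "major": 3}

def bump_for(commit: dict):
    if commit.get("breaking"):
        return "major"
    if commit["type"] == "feat":
        return "minor"
    if commit["type"] == "fix":
        return "patch"
    return None  # docs, chore, refactor, ... don't bump

def suggest_bump(commits: list[dict]):
    """Return the strongest bump implied by a range of commits."""
    best = None
    for c in commits:
        b = bump_for(c)
        if LEVELS[b] > LEVELS[best]:
            best = b
    return best
```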

Hint 4: Changelog Grouping Use a map: {feat: [], fix: [], docs: []}. Iterate commits, append to appropriate bucket, then render.


Books That Will Help

Topic Book Chapter
Git hooks “Pro Git” by Chacon Ch. 8.3
Regex parsing “Mastering Regular Expressions” by Friedl Ch. 2-3
Release automation “Continuous Delivery” by Humble & Farley Ch. 5

Implementation Hints

Conventional Commits parser:

struct ConventionalCommit {
    type_: String,
    scope: Option<String>,
    description: String,
    body: Option<String>,
    breaking: bool,
    footers: Vec<(String, String)>,
}

fn parse(message: &str) -> Result<ConventionalCommit, String> {
    let subject = message.lines().next().ok_or("Empty message".to_string())?;

    // Subject line: type(scope)!: description
    // Inner capture (group 3) grabs the scope without its parentheses.
    let re = Regex::new(r"^(\w+)(\(([^)]+)\))?(!)?:\s*(.+)$").map_err(|e| e.to_string())?;
    let caps = re.captures(subject).ok_or("Invalid format".to_string())?;

    let type_ = caps[1].to_string();
    let scope = caps.get(3).map(|m| m.as_str().to_string());
    let breaking = caps.get(4).is_some();
    let description = caps[5].to_string();

    // TODO: parse body and footers from the remaining paragraphs,
    // promoting a BREAKING CHANGE footer to breaking = true.
    Ok(ConventionalCommit { type_, scope, description,
                            body: None, breaking, footers: vec![] })
}

Learning milestones:

  1. You can parse conventional commit format → You understand the specification
  2. You can determine version bumps → You understand semver
  3. You can generate changelogs → You understand release automation
  4. You can enforce via hooks → You have a complete system

Project 12: Git Worktree Manager

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Python, Bash
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Git Worktrees / Filesystem / Productivity
  • Software or Tool: Git
  • Main Book: Pro Git by Scott Chacon

What you’ll build: A TUI (text UI) tool for managing Git worktrees—allowing you to have multiple branches checked out simultaneously in separate directories, with easy creation, switching, and cleanup.

Why it teaches Git workflows: Worktrees are Git’s hidden superpower for working on multiple branches simultaneously. By building a manager, you’ll understand how worktrees relate to the .git directory, when to use them vs. stash, and how they enable parallel development workflows.

Core challenges you’ll face:

  • Understanding worktree mechanics → maps to understanding .git and working directory separation
  • Managing multiple checkouts → maps to understanding branch locking
  • Building a usable TUI → maps to understanding developer tooling UX
  • Cleaning up orphaned worktrees → maps to understanding Git garbage collection

Key Concepts:

  • Git worktrees: Pro Git Ch. 7.11 — Scott Chacon
  • TUI development: bubble tea (Go), ratatui (Rust), or similar
  • Branch references: Git refs and HEAD management

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Understanding of Git branches, basic TUI or CLI experience


Real World Outcome

You’ll have a TUI for managing worktrees:

Example Output:

$ wt

┌─ Git Worktree Manager ─────────────────────────────────┐
│                                                        │
│  Repository: my-project                                │
│  Main worktree: /Users/dev/my-project                  │
│                                                        │
│  Worktrees:                                            │
│  ┌────────────────────────────────────────────────────┐│
│  │ ● main         /Users/dev/my-project           [M] ││
│  │   feature/auth /Users/dev/my-project-auth          ││
│  │   hotfix/bug   /Users/dev/my-project-hotfix        ││
│  │   experiment   /Users/dev/my-project-exp       [*] ││
│  └────────────────────────────────────────────────────┘│
│                                                        │
│  [M] = main worktree  [*] = current                    │
│                                                        │
│  Commands:                                             │
│  [n] New worktree  [d] Delete  [o] Open in editor      │
│  [c] Clean up      [r] Refresh [q] Quit                │
│                                                        │
└────────────────────────────────────────────────────────┘

> n (new worktree)

Branch name (or new branch): feature/new-api
Directory (default: ../my-project-new-api):

Creating worktree...
✓ Created worktree at /Users/dev/my-project-new-api
✓ Checked out branch feature/new-api

Open in editor? [y/N]: y
✓ Opened in VS Code

$ wt status
3 worktrees active:
  main         → /Users/dev/my-project (main worktree)
  feature/auth → /Users/dev/my-project-auth (2 days old, 5 commits)
  feature/new  → /Users/dev/my-project-new-api (just created)

$ wt clean
Scanning for orphaned worktrees...
Found 1 orphaned worktree:
  /Users/dev/my-project-old-feature (branch deleted)

Remove orphaned worktrees? [y/N]: y
✓ Removed 1 orphaned worktree

The Core Question You’re Answering

“How do you work on multiple branches simultaneously without constant switching?”

Before you write any code, sit with this question. Worktrees let you have multiple working directories sharing one .git database. Each worktree is an independent checkout—you can build one branch while testing another.


Concepts You Must Understand First

Stop and research these before coding:

  1. Worktree Mechanics
    • How does a worktree relate to the main .git directory?
    • Why can’t two worktrees have the same branch checked out?
    • Where does Git store worktree metadata?
    • Book Reference: “Pro Git” Ch. 7.11 — Chacon
  2. Branch Locking
    • What error do you get when trying to checkout a branch that’s in another worktree?
    • How do you find which worktree has a branch?
    • What happens when you delete a branch that’s checked out in a worktree?
    • Resource: git worktree --help
  3. TUI Development
    • How do you handle keyboard input in a terminal?
    • How do you draw boxes and update the screen?
    • What libraries exist for your language?
    • Resource: Bubble Tea (Go), ratatui (Rust), or curses (Python)

Questions to Guide Your Design

Before implementing, think through these:

  1. Worktree Creation
    • Should you create a new branch or checkout an existing one?
    • What’s a sensible naming convention for worktree directories?
    • How do you handle creation in a specific path vs. automatic path?
  2. Navigation
    • How do you switch between worktrees (cd, open editor, etc.)?
    • How do you show which worktree is “current”?
    • How do you handle worktrees on remote branches?
  3. Cleanup
    • How do you detect orphaned worktrees (deleted branch, missing directory)?
    • Should cleanup be automatic or manual?
    • How do you handle worktrees with uncommitted changes?
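
The cleanup question can start as pure logic over data you already have from `git worktree list` and `git branch`. A sketch — the field names and function name are made up:

```python
def find_orphans(worktrees, live_branches, existing_paths):
    """A worktree is orphaned if its branch was deleted or its directory vanished.

    worktrees: list of {"path": str, "branch": str} parsed from porcelain output
    live_branches: set of branch names that still exist
    existing_paths: set of worktree directories still present on disk
    """
    return [wt for wt in worktrees
            if wt["branch"] not in live_branches
            or wt["path"] not in existing_paths]
```

Note that `git worktree prune` already handles the missing-directory case; the deleted-branch case is where your tool adds value.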

Thinking Exercise

Explore Worktrees Manually

Set up worktrees and explore:

git init worktree-test && cd worktree-test
echo "main" > file.txt && git add . && git commit -m "initial"
git branch feature-a
git branch feature-b

# Create worktrees
git worktree add ../test-feature-a feature-a
git worktree add ../test-feature-b feature-b

# Explore
git worktree list
ls -la .git/worktrees/
cat .git/worktrees/test-feature-a/HEAD

Questions while exploring:

  • What’s in .git/worktrees/?
  • What happens if you try git checkout feature-a in the main worktree?
  • If you delete ../test-feature-a/, what does git worktree list show?
  • What does git worktree prune do?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What are Git worktrees and when would you use them?”
  2. “How do worktrees differ from just cloning the repository again?”
  3. “What happens to a worktree when its branch is deleted?”
  4. “How would you use worktrees in a CI/CD pipeline?”
  5. “What’s the relationship between the main .git directory and worktree checkouts?”

Hints in Layers

Hint 1: Starting Point git worktree list --porcelain gives machine-parseable output. Start by parsing this.

Hint 2: Worktree Info The porcelain output has worktree, HEAD, branch lines for each worktree. Parse into structs.

Hint 3: Creation git worktree add <path> <branch> creates a worktree. Add -b <branch> to create a new branch.

Hint 4: TUI Framework Use a framework rather than raw terminal codes. In Go, Bubble Tea is excellent. In Rust, ratatui.


Books That Will Help

Topic Book Chapter
Git worktrees “Pro Git” by Chacon Ch. 7.11
TUI development Bubble Tea documentation Getting started
CLI design “The Linux Command Line” by Shotts Ch. 30

Implementation Hints

Worktree list parsing:

type Worktree struct {
    Path   string
    Head   string
    Branch string
}

func listWorktrees() ([]Worktree, error) {
    out, err := exec.Command("git", "worktree", "list", "--porcelain").Output()
    if err != nil {
        return nil, err
    }
    lines := strings.Split(string(out), "\n")

    var worktrees []Worktree
    var current Worktree

    for _, line := range lines {
        if strings.HasPrefix(line, "worktree ") {
            current.Path = strings.TrimPrefix(line, "worktree ")
        } else if strings.HasPrefix(line, "HEAD ") {
            current.Head = strings.TrimPrefix(line, "HEAD ")
        } else if strings.HasPrefix(line, "branch ") {
            current.Branch = strings.TrimPrefix(line, "branch refs/heads/")
        } else if line == "" && current.Path != "" {
            // Blank line terminates a porcelain stanza
            worktrees = append(worktrees, current)
            current = Worktree{}
        }
    }
    if current.Path != "" { // flush the last entry if output lacked a trailing blank line
        worktrees = append(worktrees, current)
    }
    return worktrees, nil
}

Learning milestones:

  1. You can list and parse worktrees → You understand worktree metadata
  2. You can create and delete worktrees → You understand worktree lifecycle
  3. You can detect orphans and clean up → You understand worktree maintenance
  4. Your TUI is pleasant to use → You have a productivity tool

Project 13: Repository Analytics Dashboard

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Python
  • Alternative Programming Languages: TypeScript, Go, Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Data Analysis / Git Log Parsing / Visualization
  • Software or Tool: Git, Matplotlib/D3.js
  • Main Book: Software Engineering at Google by Winters et al.

What you’ll build: A dashboard that analyzes repository history to show contribution patterns, code hotspots, team dynamics, and technical debt indicators—like a mini version of GitPrime or Pluralsight Flow.

Why it teaches Git workflows: Understanding how a team uses Git reveals workflow health. By mining git log data, you’ll see how commit frequency, merge patterns, and contributor activity reflect team practices—and how to improve them.

Core challenges you’ll face:

  • Parsing git log efficiently → maps to understanding Git’s output formats
  • Calculating metrics → maps to understanding software engineering metrics
  • Visualizing trends → maps to understanding data presentation
  • Detecting patterns → maps to understanding code evolution

Key Concepts:

  • Git log formats: Pro Git Ch. 2.3 — Chacon
  • Software metrics: Software Engineering at Google Ch. 7 — Winters et al.
  • Data visualization: Matplotlib/D3.js documentation

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Data analysis basics, understanding of Git history


Real World Outcome

You’ll have a dashboard that reveals repository insights:

Example Output:

$ repo-analytics /path/to/repo --period "last 6 months"

=== Repository Analytics Dashboard ===

📊 OVERVIEW
─────────────────────────────────────────
Repository: awesome-project
Period: Jul 2024 - Jan 2025 (6 months)
Commits: 847
Contributors: 12
Lines changed: +45,231 / -12,847

📈 COMMIT ACTIVITY
─────────────────────────────────────────
Monthly commits:
  Jul ████████████████████ 156
  Aug ████████████████ 132
  Sep ███████████████████ 148
  Oct ██████████████████████ 178
  Nov ███████████████ 121
  Dec ██████████ 82 (holiday season)
  Jan ████████ 30 (partial month)

Peak day: Tuesdays (avg 8.2 commits/day)
Quietest: Weekends (avg 0.8 commits/day)

👥 TOP CONTRIBUTORS
─────────────────────────────────────────
  alice    ████████████████ 287 commits (34%)
  bob      ██████████ 178 commits (21%)
  charlie  ████████ 142 commits (17%)
  diana    ██████ 98 commits (12%)
  others   ████ 142 commits (16%)

🔥 CODE HOTSPOTS (most frequently changed)
─────────────────────────────────────────
  src/api/handlers.ts       Modified 47 times by 6 authors
  src/core/parser.ts        Modified 38 times by 4 authors
  src/utils/validation.ts   Modified 35 times by 8 authors

  ⚠️ High churn files may indicate:
     - Complex logic needing simplification
     - Missing tests causing bugs
     - Feature instability

📉 TECHNICAL DEBT INDICATORS
─────────────────────────────────────────
  TODO/FIXME comments added: 23
  TODO/FIXME comments removed: 8
  Net debt: +15 (growing)

  Large commits (>500 lines): 12
  "WIP" or "fix" commits: 34

  Merge conflicts resolved: 28
  Reverted commits: 4

🔀 MERGE PATTERNS
─────────────────────────────────────────
  Merge commits: 89
  Squash merges: 156
  Rebase merges: 42

  Average PR size: 127 lines
  Average review time: 1.8 days
  PRs merged without review: 12 (7%)

📁 FILE TYPE DISTRIBUTION
─────────────────────────────────────────
  TypeScript: 67% (24,521 LOC)
  JSON: 12% (4,891 LOC)
  Markdown: 8% (3,211 LOC)
  YAML: 5% (1,678 LOC)
  Other: 8%

The Core Question You’re Answering

“What does Git history reveal about a team’s development practices and code health?”

Before you write any code, sit with this question. Every commit tells a story—patterns in commits, authors, file changes, and timing reveal how a team works, where problems lurk, and what might need attention.


Concepts You Must Understand First

Stop and research these before coding:

  1. Git Log Formats
    • What fields can you extract from git log?
    • How do you use --format for custom output?
    • How do you efficiently iterate through large histories?
    • Book Reference: “Pro Git” Ch. 2.3 — Chacon
  2. Software Engineering Metrics
    • What’s code churn and why does it matter?
    • What’s bus factor and how do you calculate it?
    • What metrics indicate healthy vs. unhealthy repos?
    • Book Reference: “Software Engineering at Google” Ch. 7 — Winters et al.
  3. Data Visualization
    • How do you choose the right chart type?
    • How do you present trends over time?
    • How do you make terminal-based visualizations?
    • Resource: Matplotlib / D3.js documentation
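
One of the metrics above, bus factor, has a simple first approximation: the smallest set of top authors who together account for more than some share of all commits. The 50% threshold here is an assumption — definitions vary:

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of top authors covering more than `threshold` of commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    for i, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total > threshold:
            return i
    return len(counts)
```

A low number means knowledge is concentrated: losing that many people strands most of the history.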

Questions to Guide Your Design

Before implementing, think through these:

  1. Data Collection
    • What git commands give you the data you need?
    • How do you handle repositories with millions of commits?
    • How do you normalize data across different time periods?
  2. Metric Calculation
    • What metrics are genuinely useful vs. vanity metrics?
    • How do you account for different commit styles (small vs. large)?
    • How do you identify outliers (bot commits, merges, etc.)?
  3. Presentation
    • Should output be terminal, HTML, or JSON?
    • How do you make insights actionable?
    • What warnings or recommendations should you provide?

Thinking Exercise

Analyze a Real Repository

Pick an open source repository and analyze manually:

# Clone a popular project
git clone https://github.com/microsoft/vscode --depth 1000

# Analyze
git log --format="%H|%an|%ae|%at|%s" --numstat | head -100
git shortlog -sn | head -10
git log --since="6 months ago" --oneline | wc -l

Questions while analyzing:

  • Who are the top contributors?
  • What files change most frequently?
  • What patterns do you see in commit messages?
  • What would you want to know about this project’s health?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What metrics would you track to measure developer productivity?”
  2. “How would you identify technical debt from Git history?”
  3. “What does high code churn indicate, and is it always bad?”
  4. “How would you calculate the ‘bus factor’ for a repository?”
  5. “What insights can you derive from merge patterns?”

Hints in Layers

Hint 1: Starting Point Use git log --format="%H|%an|%at|%s" --numstat to get commits with file stats. Parse the output into structured data.

Hint 2: Performance For large repos, use --since and --until to limit scope. Process in streams, don’t load everything into memory.

Hint 3: Hotspot Detection Count how often each file appears in commits. Files with high counts AND multiple authors are likely complex.

Hint 4: Terminal Charts Use Unicode block characters (▁▂▃▄▅▆▇█) for simple bar charts. Libraries like asciichart can help.
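
Hint 3's hotspot detection, sketched as a parser over combined `git log --format="%H|%an|%at|%s" --numstat` output. numstat lines are `added<TAB>deleted<TAB>path`; parsing is deliberately simplified (binary files report `-` counts, renames embed `=>` in the path, and a tab inside a subject line would confuse it):

```python
from collections import defaultdict

def hotspots(log_text: str):
    """Map each path -> [times changed, set of authors] from combined log output."""
    stats = defaultdict(lambda: [0, set()])
    author = None
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:                # numstat line: added, deleted, path
            path = parts[2]
            stats[path][0] += 1
            stats[path][1].add(author)
        elif "|" in line:                  # header line: %H|%an|%at|%s
            author = line.split("|")[1]
    return dict(stats)
```

Files with a high change count and many distinct authors are the ones the dashboard flags as churn-heavy.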


Books That Will Help

Topic Book Chapter
Git log “Pro Git” by Chacon Ch. 2.3
Software metrics “Software Engineering at Google” by Winters et al. Ch. 7
Data visualization “The Visual Display of Quantitative Information” by Tufte Ch. 1-3

Implementation Hints

Git log parsing:

import subprocess

def parse_commits(repo_path, since=None):
    cmd = ["git", "-C", repo_path, "log",
           "--format=%H|%an|%at|%s"]
    if since:
        cmd.append(f"--since={since}")

    result = subprocess.run(cmd, capture_output=True, text=True, check=True)

    commits = []
    for line in result.stdout.strip().split("\n"):
        if "|" in line:
            hash_, author, timestamp, subject = line.split("|", 3)
            commits.append({
                "hash": hash_,
                "author": author,
                "timestamp": int(timestamp),
                "subject": subject
            })
    return commits

Learning milestones:

  1. You can parse git log efficiently → You understand Git’s output formats
  2. You can calculate meaningful metrics → You understand software engineering metrics
  3. You can visualize trends → You understand data presentation
  4. Your insights are actionable → You have a useful analytics tool

Project 14: Git Secret Scanner

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, Python
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Security / Pattern Matching / Git History
  • Software or Tool: Git
  • Main Book: Practical Binary Analysis by Andriesse

What you’ll build: A tool that scans Git history (not just current files) for accidentally committed secrets—API keys, passwords, tokens—and can optionally help remove them from history.

Why it teaches Git workflows: Security is a critical part of Git workflows. By building a scanner, you’ll understand how secrets persist in history even after deletion, how tools like git-secrets and truffleHog work, and how to use git filter-branch or BFG to rewrite history.

Core challenges you’ll face:

  • Pattern matching for secrets → maps to understanding secret patterns
  • Scanning all of history efficiently → maps to understanding Git object traversal
  • History rewriting → maps to understanding filter-branch and BFG
  • Minimizing false positives → maps to understanding entropy analysis

Key Concepts:

  • Secret patterns: Regular expressions for common secret formats
  • Entropy analysis: High-entropy strings are likely secrets
  • History rewriting: Pro Git Ch. 7.6 — Chacon (filter-branch)

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Regex, understanding of security basics, Project 1 completed


Real World Outcome

You’ll have a security tool for Git repositories:

Example Output:

$ secret-scan /path/to/repo

=== Git Secret Scanner ===
Scanning repository: my-project
Mode: Full history scan

Scanning 1,247 commits...
[████████████████████████████████████████] 100%

⚠️  SECRETS FOUND: 7

HIGH SEVERITY:
─────────────────────────────────────────
1. AWS Access Key
   Commit: abc1234 (2023-06-15)
   Author: alice@example.com
   File: config/prod.env (line 12)
   Pattern: AKIAIOSFODNN7EXAMPLE
   Status: ❌ Still in current HEAD

2. GitHub Personal Access Token
   Commit: def5678 (2023-08-22)
   Author: bob@example.com
   File: scripts/deploy.sh (line 45)
   Pattern: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
   Status: ✓ Deleted in later commit (but still in history!)

MEDIUM SEVERITY:
─────────────────────────────────────────
3. Private RSA Key
   Commit: ghi9012 (2023-09-01)
   File: .ssh/id_rsa
   Status: ✓ Removed via .gitignore

4-7. ... (additional findings)

=== RECOMMENDATIONS ===

IMMEDIATE ACTIONS:
1. Rotate the AWS key AKIAIOSFODNN7EXAMPLE immediately
2. Revoke the GitHub token ghp_xxx...

HISTORY CLEANUP:
To remove secrets from history, run:
$ secret-scan clean --commits abc1234,def5678

⚠️  This will rewrite history. Coordinate with your team!
    All clones will need to re-clone or rebase.

$ secret-scan clean --commits abc1234

Rewriting history to remove secrets...
Using BFG Repo-Cleaner strategy...

Before: abc1234 → config/prod.env contains AKIAIOSFODNN7...
After:  xyz7890 → config/prod.env contains ***REDACTED***

⚠️  Force push required:
    git push --force-with-lease origin main

Secret successfully removed from history!

The Core Question You’re Answering

“How do secrets persist in Git history, and how do you find and remove them?”

Before you write any code, sit with this question. When you delete a file with secrets, Git still has the old version in its history. Even after a commit is no longer reachable from any branch, it exists in the object database until garbage collection.


Concepts You Must Understand First

Stop and research these before coding:

  1. Common Secret Patterns
    • What regex patterns match AWS keys, GitHub tokens, etc.?
    • How do you balance sensitivity (catch all secrets) vs. specificity (minimize false positives)?
    • What’s entropy analysis and how does it help?
    • Resource: truffleHog source code, GitGuardian patterns
  2. Git History Traversal
    • How do you iterate through all commits and their files?
    • How do you access file content at each commit without checkout?
    • How do you handle large repositories efficiently?
    • Book Reference: “Pro Git” Ch. 10 — Chacon
  3. History Rewriting
    • What’s the difference between filter-branch, filter-repo, and BFG?
    • What are the consequences of rewriting pushed history?
    • How do you coordinate history rewrites with a team?
    • Book Reference: “Pro Git” Ch. 7.6 — Chacon

Questions to Guide Your Design

Before implementing, think through these:

  1. Detection Strategy
    • Should you scan every file in every commit or be smarter?
    • How do you handle binary files?
    • How do you prioritize findings by severity?
  2. Pattern Database
    • How do you organize patterns for different secret types?
    • How do you let users add custom patterns?
    • How do you handle false positives (UUIDs that look like tokens)?
  3. Remediation
    • Should you just report, or also offer to clean up?
    • How do you handle secrets that are in current HEAD vs. only in history?
    • What warnings do you give about history rewriting?

Thinking Exercise

Create and Detect a Secret

Commit a secret and trace what happens:

git init secret-test && cd secret-test
echo "AWS_KEY=AKIAIOSFODNN7EXAMPLE" > config.env
git add . && git commit -m "Add config"

# "Delete" the secret
rm config.env
git add . && git commit -m "Remove config"

# Is it really gone?
git log --all --oneline
git show HEAD~1:config.env
git log -p --all -S "AKIAIOSFODNN7"

Questions while tracing:

  • Can you still see the secret?
  • What git commands reveal it?
  • What would git gc do?
  • How would you truly remove it?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How would you detect secrets in a Git repository’s history?”
  2. “What’s the difference between deleting a file and removing it from Git history?”
  3. “How would you handle a situation where a secret was pushed to a public repository?”
  4. “What’s entropy analysis and how does it help detect secrets?”
  5. “What are the risks of using git filter-branch on a shared repository?”

Hints in Layers

Hint 1: Starting Point Use git log --all --pretty=format:"%H" to list every commit, then git show <commit>:<path> to read a file’s contents at that commit without checking it out.

Hint 2: Pattern Matching Create a pattern database with regexes like AKIA[0-9A-Z]{16} for AWS keys, ghp_[A-Za-z0-9]{36} for GitHub tokens.

Hint 3: Entropy Analysis Calculate Shannon entropy of strings. High entropy (> 4.5 bits/char) suggests randomness (keys, tokens). Low entropy is probably normal code.

Hint 4: History Cleaning Use git filter-repo (successor to filter-branch) or BFG Repo-Cleaner. They’re faster and safer than raw filter-branch.
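
Hints 1 and 2 combine into a tiny scanner over a blob's text. The pattern list below is a starter set, not exhaustive — real tools carry hundreds of patterns and layer entropy checks (Hint 3) on top to cut false positives:

```python
import re

# Starter patterns; extend from truffleHog / GitGuardian pattern sets
PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub PAT": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_text(text: str):
    """Return (secret type, matched string) for every pattern hit in a blob."""
    findings = []
    for name, pat in PATTERNS.items():
        for m in pat.finditer(text):
            findings.append((name, m.group(0)))
    return findings
```

In the full scanner you would call this on the output of `git show <commit>:<path>` for every blob reachable in history.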


Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| History rewriting | "Pro Git" by Chacon | Ch. 7.6 |
| Pattern matching | "Mastering Regular Expressions" by Friedl | Ch. 4-5 |
| Security practices | "The DevOps Handbook" by Kim et al. | Ch. 20 |

Implementation Hints

Entropy calculation:

fn entropy(s: &str) -> f64 {
    // Count occurrences of each byte value.
    let mut freq = [0u32; 256];
    for byte in s.bytes() {
        freq[byte as usize] += 1;
    }

    // Shannon entropy in bits per byte: -sum(p * log2(p)) over observed bytes.
    let len = s.len() as f64;
    freq.iter()
        .filter(|&&count| count > 0)
        .map(|&count| {
            let p = count as f64 / len;
            -p * p.log2()
        })
        .sum()
}

// Short strings rarely reach high entropy, so require a minimum length
// before flagging; both thresholds are tunable heuristics.
fn is_likely_secret(s: &str) -> bool {
    s.len() >= 20 && entropy(s) > 4.5
}

Learning milestones:

  1. You can scan files for pattern matches → You understand secret detection
  2. You can traverse all history efficiently → You understand Git object traversal
  3. You can calculate entropy to reduce false positives → You understand advanced detection
  4. You can suggest or perform history cleanup → You understand remediation

Final Overall Project: Git Workflow Platform

  • File: LEARN_ADVANCED_GIT_WORKFLOWS.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, TypeScript
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 5: Master
  • Knowledge Area: Full-Stack / Platform Engineering / Git Internals
  • Software or Tool: Git, GitHub, PostgreSQL
  • Main Book: Software Engineering at Google by Winters et al.

What you’ll build: A comprehensive Git workflow platform that combines multiple projects—a web UI for managing repositories, integrated trunk-based development tooling, automatic code review, monorepo support, and analytics—essentially a mini version of GitHub’s enterprise features or Graphite.

Why this is the capstone: This project synthesizes everything you’ve learned. You’ll integrate Git internals knowledge with workflow automation, build tools that work together, and create a platform that could genuinely improve a team’s development experience.

Core challenges you’ll face:

  • Integrating multiple subsystems → maps to understanding system architecture
  • Building a web interface for Git operations → maps to understanding Git over HTTP
  • Handling concurrency and scale → maps to understanding production systems
  • Providing a cohesive user experience → maps to understanding developer tooling
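The "Git over HTTP" challenge starts with the smart-HTTP handshake: a client's first request is GET /info/refs?service=git-upload-pack, and the server replies with a pkt-line framed ref advertisement, which it can delegate to git upload-pack. A minimal Python sketch of that first half of the handshake (REPO_PATH and the handler wiring are assumptions; a production server would also implement the POST /git-upload-pack step or shell out to git http-backend):

```python
# Minimal sketch of Git smart-HTTP ref advertisement.
# Assumes git is installed and REPO_PATH points at a real repository.
import subprocess
from http.server import BaseHTTPRequestHandler

REPO_PATH = "/tmp/secret-test"  # hypothetical repo path

def pkt_line(data: bytes) -> bytes:
    """Frame one pkt-line: 4 hex digits of total length (prefix included)."""
    return f"{len(data) + 4:04x}".encode() + data

class GitHTTPHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path includes the query string, so this matches the exact request.
        if self.path == "/info/refs?service=git-upload-pack":
            # Delegate the actual ref listing to git's own plumbing.
            refs = subprocess.run(
                ["git", "upload-pack", "--advertise-refs", REPO_PATH],
                capture_output=True, check=True).stdout
            # Protocol: service header pkt-line, flush packet "0000", then refs.
            body = pkt_line(b"# service=git-upload-pack\n") + b"0000" + refs
            self.send_response(200)
            self.send_header("Content-Type",
                             "application/x-git-upload-pack-advertisement")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)
```

Serving this handler with http.server.HTTPServer lets a git client begin a clone handshake against your platform, which is exactly the integration point the capstone's web UI builds on.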

Key Concepts:

  • Platform architecture: Software Engineering at Google Ch. 16-18 — Winters et al.
  • Git HTTP protocol: Git documentation on smart HTTP
  • Web application design: Designing Data-Intensive Applications — Kleppmann

Difficulty: Master
Time estimate: 3-6 months
Prerequisites: All previous projects completed


Real World Outcome

You’ll have a platform that makes Git workflows seamless:

┌─────────────────────────────────────────────────────────────────┐
│  Git Workflow Platform - Dashboard                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Repositories                         Quick Actions             │
│  ┌─────────────────────────────┐     ┌─────────────────────┐   │
│  │ ● my-monorepo     [healthy] │     │ New Repository      │   │
│  │ ● api-service     [2 PRs]   │     │ Import from GitHub  │   │
│  │ ● web-frontend    [warning] │     │ Run Analytics       │   │
│  └─────────────────────────────┘     └─────────────────────┘   │
│                                                                 │
│  Active Stacks                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ auth-refactor (3 PRs)                                   │   │
│  │   PR #127 ✓ → PR #128 ⏳ → PR #129 ⏳                    │   │
│  │   Next: Merge #127, auto-rebase stack                   │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Trunk Health                                                   │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ ✓ All tests passing                                     │   │
│  │ ✓ No stale branches (oldest: 1.2 days)                  │   │
│  │ ⚠️ 2 PRs awaiting review > 24h                          │   │
│  │ ❌ Secret detected in commit abc123 (alert sent)        │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Recent Activity                                                │
│  10:32 alice merged PR #125 into main                          │
│  10:28 Bot: Updated changelog for v2.3.0                       │
│  10:15 bob opened PR #130 "Add dark mode"                      │
│  09:45 CI: All 847 tests passed on main                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Features:

  • Repository management with analytics
  • Stacked PR management with automatic rebasing
  • Trunk-based development enforcement
  • Automated code review bot
  • Secret scanning with alerts
  • Conventional commit enforcement
  • Automatic changelog generation
  • Monorepo affected detection
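The "monorepo affected detection" feature reduces to two steps: ask git which files changed relative to the merge base with trunk, then map those files to owning projects. A sketch under the simplifying assumption that each top-level directory is one project (real tools like the Monorepo Task Runner would read a project graph instead):

```python
# Affected-project detection sketch for a monorepo.
# Assumes each top-level directory is one project; trunk branch name is "main".
import subprocess

def changed_files(base: str = "main") -> list[str]:
    # Three dots: diff against the merge base, i.e. only this branch's changes.
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True).stdout
    return [line for line in out.splitlines() if line]

def affected_projects(files: list[str]) -> set[str]:
    # Map each changed path to its top-level directory; ignore root-level files.
    return {path.split("/", 1)[0] for path in files if "/" in path}
```

With this, the platform only runs tests and review automation for affected_projects(changed_files()), not the whole repository.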

Learning milestones:

  1. You can design a multi-service architecture → You understand platform design
  2. You can integrate Git operations with a web UI → You understand Git’s interfaces
  3. You can handle concurrent Git operations → You understand locking and transactions
  4. Your platform improves developer workflow → You’ve built something genuinely useful

Project Comparison Table

| # | Project | Difficulty | Time | Depth | Fun Factor |
|---|---------|------------|------|-------|------------|
| 1 | Git Object Explorer | Intermediate | Weekend | ★★★★☆ | ★★★☆☆ |
| 2 | Commit Graph Visualizer | Intermediate | 1-2 weeks | ★★★★★ | ★★★★☆ |
| 3 | Interactive Rebase Simulator | Advanced | 1-2 weeks | ★★★★★ | ★★★★☆ |
| 4 | Three-Way Merge Engine | Expert | 2-4 weeks | ★★★★★ | ★★★☆☆ |
| 5 | Git Hooks Framework | Intermediate | 1 week | ★★★☆☆ | ★★★☆☆ |
| 6 | Trunk-Based Dev Pipeline | Advanced | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 7 | Code Review Bot | Advanced | 2-3 weeks | ★★★☆☆ | ★★★★★ |
| 8 | Monorepo Task Runner | Expert | 1 month+ | ★★★★★ | ★★★★★ |
| 9 | Git Bisect Automator | Intermediate | 1 week | ★★★☆☆ | ★★★★☆ |
| 10 | Stacked PRs Manager | Advanced | 2-3 weeks | ★★★★★ | ★★★★★ |
| 11 | Conventional Commits Enforcer | Intermediate | 1 week | ★★★☆☆ | ★★★☆☆ |
| 12 | Git Worktree Manager | Intermediate | 1-2 weeks | ★★★☆☆ | ★★★★☆ |
| 13 | Repository Analytics | Advanced | 2-3 weeks | ★★★★☆ | ★★★★★ |
| 14 | Git Secret Scanner | Advanced | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 15 | Git Workflow Platform | Master | 3-6 months | ★★★★★ | ★★★★★ |

Recommendation

For beginners to Git internals: Start with Project 1 (Git Object Explorer), then Project 2 (Commit Graph Visualizer). These build foundational understanding of what Git actually stores and how history works.

For intermediate developers: Start with Project 5 (Git Hooks Framework) if you want immediate practical value, or Project 3 (Interactive Rebase Simulator) if you want to deepen your understanding of rebase.

For advanced developers: Go straight to Project 8 (Monorepo Task Runner) or Project 10 (Stacked PRs Manager). These are the most impactful for real-world workflows.

Recommended progression for maximum learning:

  1. Project 1 → 2 → 3 (Git internals foundation)
  2. Project 5 → 11 → 6 (Workflow automation)
  3. Project 7 → 13 → 14 (Code quality and security)
  4. Project 10 → 8 (Advanced workflow tooling)
  5. Final Project (Integration and mastery)

Summary

This learning path covers advanced Git workflows through 15 hands-on projects. Here’s the complete list:

| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|--------------|---------------|------------|---------------|
| 1 | Git Object Explorer | Python | Intermediate | Weekend |
| 2 | Commit Graph Visualizer | Python | Intermediate | 1-2 weeks |
| 3 | Interactive Rebase Simulator | Python | Advanced | 1-2 weeks |
| 4 | Three-Way Merge Engine | C | Expert | 2-4 weeks |
| 5 | Git Hooks Framework | Bash/Python | Intermediate | 1 week |
| 6 | Trunk-Based Development Pipeline | Python | Advanced | 2-3 weeks |
| 7 | Code Review Bot | Python | Advanced | 2-3 weeks |
| 8 | Monorepo Task Runner | Rust | Expert | 1 month+ |
| 9 | Git Bisect Automator | Python | Intermediate | 1 week |
| 10 | Stacked PRs Manager | Go | Advanced | 2-3 weeks |
| 11 | Conventional Commits Enforcer | Rust | Intermediate | 1 week |
| 12 | Git Worktree Manager | Go | Intermediate | 1-2 weeks |
| 13 | Repository Analytics Dashboard | Python | Advanced | 2-3 weeks |
| 14 | Git Secret Scanner | Rust | Advanced | 2-3 weeks |
| 15 | Git Workflow Platform (Capstone) | Go | Master | 3-6 months |

For beginners: Start with projects #1, #2, #5
For intermediate: Jump to projects #3, #6, #9, #11
For advanced: Focus on projects #8, #10, #14, #15

Expected Outcomes

After completing these projects, you will:

  • Understand Git’s internal object model and how commits form a DAG
  • Master rebase strategies and know when to use merge vs. rebase
  • Implement trunk-based development with feature flags
  • Build tools that automate code review and enforce quality
  • Create monorepo tooling for affected detection and caching
  • Detect and remediate security issues in Git history
  • Design and build Git workflow platforms

You’ll have built 15 working projects that demonstrate deep understanding of Git workflows from first principles—from parsing objects to building enterprise-grade platforms.