Learn Advanced Git Workflows: From Basic Commits to Monorepo Mastery
Goal: Deeply understand how Git actually works under the hood—from the object database and DAG structure to advanced workflows like trunk-based development, sophisticated rebase strategies, professional code review flows, and monorepo patterns. You’ll understand not just the commands, but why they work, what problems they solve, and how to architect Git workflows for teams of any size.
Why Advanced Git Workflows Matter
In 2005, Linus Torvalds created Git in just two weeks to manage Linux kernel development after the proprietary BitKeeper revoked its free license. His design was radical: a distributed version control system where every developer has the complete history, where branching is nearly instantaneous, and where the data model is built on cryptographic integrity.
The scale of Git’s impact:
- Over 100 million repositories on GitHub alone
- Linux kernel: 1.3+ million commits, 25,000+ contributors
- Google’s monorepo: 2+ billion lines of code, 86TB of data
- Microsoft’s Windows: 3.5 million files, largest Git repo ever migrated
Why most developers never go beyond basics:
- Git’s porcelain (user-facing) commands hide the plumbing (internal operations)
- Most tutorials teach “git add, commit, push” without explaining the object model
- Workflow decisions (merge vs. rebase, trunk vs. feature branches) are made without understanding tradeoffs
- Monorepo challenges are only discovered at scale
What understanding workflows unlocks:
- Debug any Git situation by understanding the underlying data structure
- Choose the right workflow for your team’s needs
- Implement CI/CD pipelines that leverage Git’s capabilities
- Scale repositories from solo projects to enterprise monorepos
Core Concept Analysis
The Git Object Model: Everything is Content-Addressable
Before understanding workflows, you must understand Git’s foundation: the object database.
┌─────────────────────────────────────────────────────────────────┐
│ .git/objects/ │
├─────────────────────────────────────────────────────────────────┤
│ │
│ BLOB (file content) TREE (directory) │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ "Hello, World!" │ │ 100644 blob abc │ │
│ │ │ │ → README │ │
│ │ SHA: abc123... │ │ 040000 tree def │ │
│ └──────────────────┘ │ → src/ │ │
│ │ SHA: def456... │ │
│ └──────────────────┘ │
│ │
│ COMMIT TAG │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ tree: def456 │ │ object: ghi789 │ │
│ │ parent: 000000 │ │ type: commit │ │
│ │ author: Alice │ │ tag: v1.0.0 │ │
│ │ message: "Init" │ │ tagger: Bob │ │
│ │ SHA: ghi789... │ │ SHA: jkl012... │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Key insight: Git stores SNAPSHOTS, not diffs.
Each commit points to a complete tree of the entire project.
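You can verify this content-addressing yourself: a blob's ID is the SHA-1 of a short header plus the file's bytes, nothing more. A minimal sketch (its output should match `git hash-object` on the same bytes):

```python
# Compute a Git blob ID by hand: sha1 of "blob <size>\0" + content.
import hashlib

def blob_sha(content: bytes) -> str:
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# Identical content always yields the identical object ID, in any repo.
print(blob_sha(b"Hello, World!\n"))
```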
The Commit DAG (Directed Acyclic Graph)
Git history is not a linear sequence—it’s a graph where commits point to their parents.
Simple linear history:
A ← B ← C ← D ← E (HEAD → main)
Branching:
A ← B ← C ← D ← E (HEAD → main)
↖
F ← G ← H (feature)
After merge:
A ← B ← C ← D ← E ← M (HEAD → main)
↖ ↗
F ← G ← H (feature)
After rebase (feature onto main):
A ← B ← C ← D ← E (main)
↖
F' ← G' ← H' (HEAD → feature)
Note: F', G', H' are NEW commits with different SHAs
(same content, different parent = different hash)
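To see why F', G', H' must get new SHAs, note that the parent ID is part of the bytes Git hashes. A simplified sketch (real commit objects also contain author and committer lines, omitted here):

```python
# Same tree, same message, different parent: different commit SHA.
import hashlib

def commit_sha(tree: str, parent: str, message: str) -> str:
    body = f"tree {tree}\nparent {parent}\n\n{message}\n".encode()
    return hashlib.sha1(f"commit {len(body)}\0".encode() + body).hexdigest()

tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # the well-known empty tree
print(commit_sha(tree, "a" * 40, "Add feature"))   # F on the old base
print(commit_sha(tree, "e" * 40, "Add feature"))   # F' after rebase: new SHA
```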
Trunk-Based Development vs. Feature Branch Flow
┌─────────────────────────────────────────────────────────────────┐
│ GITFLOW (Long-lived branches) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ main ─────○─────────────────○─────────────────○──── │
│ ↑ ↑ ↑ │
│ release ─────┼───○───○────────┼───○───○────────┼───── │
│ │ ↑ ↑ │ ↑ ↑ │ │
│ develop ─○─○─┼─○─┼───┼─○─○─○──┼───┼───┼─○─○────┼──── │
│ ↑ │ │ │ ↑ ↑ │ │ │ ↑ │ │
│ feature/ └───┘ │ │ └───┘ │ │ │ └──────┘ │
│ ↑ │ │ │ │ │ │
│ hotfix └───────┴───┘ └───┴───┘ │
│ │
│ Problems: Merge hell, long-lived branches, integration pain │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TRUNK-BASED DEVELOPMENT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ main ────○──○──○──○──○──○──○──○──○──○──○──○──○──○──○───→ │
│ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ short- └──┘ └──┘ └──┘ └──┘ └──┘ └──┘ └──┘ └── │
│ lived │
│ branches (< 1 day, max 2 days) │
│ │
│ Key practices: │
│ • Feature flags for incomplete work │
│ • Small, frequent commits │
│ • Continuous integration on every push │
│ • No long-lived branches │
└─────────────────────────────────────────────────────────────────┘
Merge vs. Rebase: The Fundamental Tradeoff
BEFORE (same starting point):
A ← B ← C (main)
↖
D ← E (feature)
┌─────────────────────────────────────────────────────────────────┐
│ MERGE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ A ← B ← C ←───── M (main after merge) │
│ ↖ ↗ │
│ D ← E (feature) │
│ │
│ Pros: Cons: │
│ • Preserves exact history • "Merge commit" clutter │
│ • Non-destructive • Non-linear history │
│ • Safe for shared branches • Harder to read │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REBASE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ A ← B ← C ← D' ← E' (feature after rebase onto main) │
│ │
│ Pros: Cons: │
│ • Clean, linear history • Rewrites history (new SHAs) │
│ • Easy to follow • NEVER on shared branches │
│ • Better for bisect • Can lose merge context │
└─────────────────────────────────────────────────────────────────┘
THE GOLDEN RULE:
"Never rebase commits that exist outside your repository"
(i.e., commits you've already pushed to shared branches)
Interactive Rebase: The Power Tool
git rebase -i HEAD~5
pick abc1234 Add user model
squash def5678 Fix typo in user model ← combines with previous
reword ghi9012 Add authentication ← edit commit message
edit jkl3456 Add password hashing ← pause here to amend
drop mno7890 WIP: debugging stuff ← delete this commit
Result: Clean, logical commit history ready for code review
┌────────────────────────────────────────────────────────────────┐
│ Commands available in interactive rebase: │
├────────────────────────────────────────────────────────────────┤
│ p, pick = use commit │
│ r, reword = use commit, but edit the commit message │
│ e, edit = use commit, but stop for amending │
│ s, squash = use commit, but meld into previous commit │
│ f, fixup = like "squash", but discard this commit's log msg │
│ x, exec = run command (the rest of the line) using shell │
│ d, drop = remove commit │
└────────────────────────────────────────────────────────────────┘
Code Review Flows: From PR to Merge
┌─────────────────────────────────────────────────────────────────┐
│ PULL REQUEST LIFECYCLE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Developer Reviewer │
│ ───────── ──────── │
│ │ │ │
│ │ 1. Create feature branch │ │
│ ├─────────────────────────> │ │
│ │ │ │
│ │ 2. Push commits │ │
│ ├─────────────────────────> │ │
│ │ │ │
│ │ 3. Open PR │ │
│ ├─────────────────────────> │ │
│ │ │ 4. Review code │
│ │ <─────────────────────────┤ │
│ │ Request changes │ │
│ │ │ │
│ │ 5. Address feedback │ │
│ ├─────────────────────────> │ │
│ │ (force push if rebase) │ │
│ │ │ │
│ │ <─────────────────────────┤ 6. Approve │
│ │ │ │
│ │ 7. Squash and merge │ │
│ └─────────────────────────> │ │
│ │ │
└─────────────────────────────────────────────────────────────────┘
Merge strategies on PR completion:
• Merge commit : Preserves all commits + merge commit
• Squash merge : All PR commits → single commit on main
• Rebase merge : Replay commits on top of main (no merge commit)
Monorepo Architecture
┌─────────────────────────────────────────────────────────────────┐
│ POLYREPO (Traditional) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ repo-api │ │ repo-web │ │ repo-lib │ │
│ ├──────────┤ ├──────────┤ ├──────────┤ │
│ │ .git/ │ │ .git/ │ │ .git/ │ │
│ │ src/ │ │ src/ │ │ src/ │ │
│ │ tests/ │ │ tests/ │ │ tests/ │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ ↑ ↑ ↑ │
│ └──────────────┴──────────────┘ │
│ npm install from registry │
│ │
│ Problems: Dependency versioning, cross-repo changes, │
│ inconsistent tooling, "diamond dependency" hell │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MONOREPO (Single Repository) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ monorepo/ │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │ .git/ │ │
│ │ packages/ │ │
│ │ ├── api/ (can import from lib directly) │ │
│ │ ├── web/ (can import from lib directly) │ │
│ │ └── lib/ (shared code) │ │
│ │ tools/ │ │
│ │ └── build-system/ │ │
│ │ nx.json / turbo.json │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Benefits: Atomic commits across packages, shared tooling, │
│ single version of dependencies, easier refactoring │
│ │
│ Challenges: Scale (clone time, CI time), permission control, │
│ finding what changed in large diffs │
└─────────────────────────────────────────────────────────────────┘
Monorepo tools comparison:
┌─────────────┬───────────────┬───────────────┬──────────────────┐
│ Tool │ Task Caching │ Affected Cmds │ Language Support │
├─────────────┼───────────────┼───────────────┼──────────────────┤
│ Nx │ Local+Remote │ Yes │ JS/TS, Go, Rust │
│ Turborepo │ Local+Remote │ Limited │ JS/TS primarily │
│ Bazel │ Remote │ Yes │ Polyglot │
│ Lerna │ No (legacy) │ Yes │ JS/TS │
│ Rush │ Local │ Yes │ JS/TS │
└─────────────┴───────────────┴───────────────┴──────────────────┘
Git Internals: The Plumbing Commands
┌─────────────────────────────────────────────────────────────────┐
│ PORCELAIN vs PLUMBING │
├─────────────────────────────────────────────────────────────────┤
│ │
│ PORCELAIN (User-friendly) PLUMBING (Low-level) │
│ ───────────────────────── ───────────────────── │
│ git add git hash-object │
│ git commit git update-index │
│ git checkout git read-tree │
│ git merge git write-tree │
│ git rebase git commit-tree │
│ git log git cat-file │
│ git status git ls-files │
│ git diff git rev-parse │
│ git update-ref │
│ git symbolic-ref │
│ │
│ Understanding plumbing = understanding Git │
└─────────────────────────────────────────────────────────────────┘
Example: What "git commit" actually does:
1. git write-tree → Create tree object from index
2. git commit-tree → Create commit object pointing to tree
3. git update-ref HEAD → Update HEAD to point to new commit
You can manually create commits using only plumbing commands!
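As a sketch, those three steps can be scripted directly. This assumes a repository with changes already staged and an existing `main` branch:

```python
# Create a commit with plumbing only: write-tree, commit-tree, update-ref.
import subprocess

def git(*args: str) -> str:
    return subprocess.run(("git",) + args, capture_output=True,
                          text=True, check=True).stdout.strip()

tree = git("write-tree")                              # snapshot the index as a tree
parent = git("rev-parse", "HEAD")                     # current branch tip
commit = git("commit-tree", tree, "-p", parent,
             "-m", "Committed via plumbing")          # wrap the tree in a commit
git("update-ref", "refs/heads/main", commit)          # move the branch pointer
print("New commit:", commit)
```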
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Object Model | Git stores snapshots (blobs, trees, commits, tags) in a content-addressable database. Every object is identified by its SHA-1 hash. |
| The DAG | Commits form a directed acyclic graph where each commit points to its parent(s). Branches are just pointers to commits. |
| Merge vs. Rebase | Merge preserves history with merge commits; rebase rewrites history for linearity. Never rebase shared branches. |
| Trunk-Based Dev | All developers commit to main frequently (daily), using feature flags for incomplete work. Minimizes merge conflicts. |
| Interactive Rebase | Rewrite local history: squash, reorder, edit, drop commits. Essential for clean PRs. |
| Code Review Flow | PRs gate changes. Reviewers enforce quality. Squash/rebase on merge for clean main history. |
| Monorepo Patterns | Single repo for multiple projects. Requires affected commands, task caching, and smart CI to scale. |
| Plumbing Commands | Low-level commands that porcelain builds upon. Understanding these = understanding Git. |
Deep Dive Reading by Concept
Git Internals
| Concept | Book & Chapter |
|---|---|
| Object model (blobs, trees, commits) | Pro Git by Scott Chacon — Ch. 10.1-10.2: “Git Internals - Plumbing and Porcelain” |
| Pack files and garbage collection | Pro Git by Scott Chacon — Ch. 10.4: “Packfiles” |
| How refs work | Pro Git by Scott Chacon — Ch. 10.3: “Git References” |
Branching and Merging
| Concept | Book & Chapter |
|---|---|
| Branch mechanics | Pro Git by Scott Chacon — Ch. 3.1: “Branches in a Nutshell” |
| Merge strategies | Pro Git by Scott Chacon — Ch. 3.2: “Basic Branching and Merging” |
| Rebase fundamentals | Pro Git by Scott Chacon — Ch. 3.6: “Rebasing” |
| Advanced rebasing | Git Internals by Scott Chacon (Peepcode PDF) — Ch. 5: “Rebasing” |
Workflows
| Concept | Book & Chapter |
|---|---|
| Distributed workflows | Pro Git by Scott Chacon — Ch. 5.1-5.2: “Distributed Workflows” |
| Contributing to projects | Pro Git by Scott Chacon — Ch. 5.3: “Maintaining a Project” |
| Trunk-based development | Accelerate by Nicole Forsgren — Ch. 4: “Technical Practices” |
| Continuous delivery | Continuous Delivery by Jez Humble — Ch. 14: “Advanced Version Control” |
Monorepos
| Concept | Book & Chapter |
|---|---|
| Monorepo philosophy | Software Engineering at Google by Winters et al. — Ch. 16: “Version Control and Branch Management” |
| Scaling build systems | Software Engineering at Google — Ch. 18: “Build Systems and Build Philosophy” |
Essential Reading Order
For maximum comprehension, read in this order:
- Foundation (Week 1):
- Pro Git Ch. 10.1-10.3 (Git internals)
- Pro Git Ch. 3.1-3.2 (Branching basics)
- Intermediate (Week 2):
- Pro Git Ch. 3.6 (Rebasing)
- Pro Git Ch. 5.1-5.3 (Distributed workflows)
- Advanced (Week 3-4):
- Accelerate Ch. 4 (Trunk-based development)
- Software Engineering at Google Ch. 16 (Version control at scale)
Project 1: Git Object Explorer
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Rust, C
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Git Internals / Binary Parsing
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A tool that explores the .git directory, decompresses and parses Git objects (blobs, trees, commits, tags), and displays their contents in human-readable format with SHA verification.
Why it teaches Git workflows: Before you can master advanced workflows, you need to see what Git actually stores. This project forces you to understand that commits are just text files with parent pointers, branches are just files containing SHAs, and the entire history is a content-addressable database.
Core challenges you’ll face:
- Decompressing zlib-compressed objects → maps to understanding Git’s storage format
- Parsing different object types → maps to understanding blob vs. tree vs. commit structure
- Following parent pointers to reconstruct history → maps to understanding the DAG
- Verifying SHA-1 hashes → maps to understanding content-addressability
Key Concepts:
- Object storage format: Pro Git Ch. 10.2 — Scott Chacon
- Zlib compression: Python `zlib` module documentation
- SHA-1 hashing: Serious Cryptography Ch. 6 — Aumasson
Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: Python file I/O, basic understanding of hashing, familiarity with command-line git
Real World Outcome
You’ll have a command-line tool that can inspect any Git repository’s internal structure. When you run it, you’ll see the raw objects that make up Git’s database:
Example Output:
$ ./git-explorer /path/to/repo
=== Git Object Explorer ===
Repository: /path/to/repo
Scanning .git/objects...
Found 247 objects
--- Object: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad ---
Type: commit
Size: 243 bytes
SHA verified: ✓
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
author Alice <alice@example.com> 1703001600 -0800
committer Alice <alice@example.com> 1703001600 -0800
Add user authentication feature
--- Object: 4b825dc642cb6eb9a060e54bf8d69288fbee4904 ---
Type: tree
Size: 66 bytes
100644 blob abc123... README.md
100755 blob def456... src/main.py
040000 tree ghi789... tests/
--- Ref: refs/heads/main ---
Points to: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
$ ./git-explorer --follow 3b18e512
Commit graph from 3b18e512:
3b18e51 ← a1b2c3d ← 5e6f7a8 ← (root)
│
└── "Add user authentication feature"
The Core Question You’re Answering
“What IS a Git commit? What does Git actually store, and how does branching work at the byte level?”
Before you write any code, sit with this question. Most developers think of commits as “diffs” or “changes,” but Git stores complete snapshots. A branch isn’t a separate copy of files—it’s just a 41-byte file containing a SHA hash.
Concepts You Must Understand First
Stop and research these before coding:
- Content-Addressable Storage
- What does “content-addressable” mean?
- Why does changing one byte in a file create a completely different hash?
- How does this enable Git’s integrity checking?
- Book Reference: “Pro Git” Ch. 10.2 — Scott Chacon
- Zlib Compression
- What algorithm does zlib use (hint: DEFLATE)?
- Why does Git compress objects?
- How do you identify compressed vs. uncompressed data?
- Book Reference: Python `zlib` module documentation
- Object Format
- What’s the header format for Git objects?
- How do blob, tree, and commit objects differ structurally?
- What’s the difference between object content and object hash input?
- Book Reference: “Pro Git” Ch. 10.2 — Scott Chacon
Questions to Guide Your Design
Before implementing, think through these:
- Object Discovery
- How will you find all objects in `.git/objects/`?
- What about packed objects in `.git/objects/pack/`?
- How do you handle the `xx/yyyyyy...` directory structure?
- Parsing Strategy
- How will you detect the object type from the header?
- How will you handle null bytes in binary data?
- How will you parse tree entries (mode, name, SHA)?
- Verification
- How do you verify the SHA matches the content?
- What should happen if verification fails?
- How do you handle corrupted objects?
Thinking Exercise
Trace a Commit’s Components
Before coding, manually inspect a real Git object:
# In any git repo, find an object
$ ls .git/objects/
3b/ 4a/ 5c/ info/ pack/
$ ls .git/objects/3b/
18e512dba79e4c8300dd08aeb37f8e728b8dad
# Decompress and view it
$ python3 -c "import zlib; print(zlib.decompress(open('.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad', 'rb').read()))"
Questions while tracing:
- What’s the format of the header you see?
- How many null bytes separate header from content?
- If it’s a commit, what fields do you see?
- Can you manually verify the SHA by hashing “type size\0content”?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What’s the difference between a Git blob and a Git tree?”
- “If two files in a repository have identical content, how many blobs does Git store, and why?”
- “What happens internally when you run `git add`?”
- “Explain why changing a single character in a file changes the commit hash of every ancestor.”
- “How does Git know if a file has been modified without storing diffs?”
Hints in Layers
Hint 1: Starting Point
Look inside .git/objects/. The first two characters of a SHA become a subdirectory name; the rest is the filename.
Hint 2: Reading Objects
Every object starts with: {type} {size}\0{content}. Use zlib.decompress() to get the raw bytes first.
Hint 3: Parsing Types
After decompression, split on the first null byte. Parse the header to get type and size. For trees, entries are: {mode} {filename}\0{20-byte SHA}.
Hint 4: Verification
To verify a SHA, compute: sha1(f"{type} {len(content)}\0{content}"). The result should match the filename.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git object model | “Pro Git” by Scott Chacon | Ch. 10.1-10.2 |
| Binary file parsing in Python | “Black Hat Python” by Justin Seitz | Ch. 3 |
| Content-addressable storage | “Designing Data-Intensive Applications” by Kleppmann | Ch. 3 |
Implementation Hints
Git objects are stored as: {type} {size}\0{content}, then zlib-compressed, then placed in .git/objects/{first-2-chars-of-sha}/{remaining-38-chars}.
Object types:
- blob: Just raw file content
- tree: List of entries, each with an ASCII octal mode (e.g. `100644`; subdirectory entries use `40000` with no leading zero), a space, the filename, a null byte, then the 20-byte binary SHA
- commit: Text with “tree”, “parent” (0 or more), “author”, “committer”, blank line, message
To build the DAG visualization, follow parent pointers recursively until you hit a commit with no parents (the root).
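A minimal loose-object reader following that layout (packed objects under `.git/objects/pack/` are a separate format and are not handled here):

```python
# Read, decompress, and verify one loose object.
import hashlib
import sys
import zlib

def read_object(repo: str, sha: str):
    path = f"{repo}/.git/objects/{sha[:2]}/{sha[2:]}"
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())          # b"<type> <size>\0<content>"
    header, _, body = raw.partition(b"\0")
    obj_type, size = header.decode().split()
    assert int(size) == len(body), "size mismatch"
    assert hashlib.sha1(raw).hexdigest() == sha, "SHA verification failed"
    return obj_type, body

obj_type, body = read_object(sys.argv[1], sys.argv[2])
print(obj_type, f"({len(body)} bytes)")
```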
Learning milestones:
- You can decompress and read a blob → You understand the storage format
- You can parse tree entries and follow them → You understand directory structure
- You can follow parent pointers through commits → You understand the DAG
- You can verify SHAs match content → You understand content-addressability
Project 2: Commit Graph Visualizer
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Rust, JavaScript (D3.js for visualization)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Graph Algorithms / Git DAG
- Software or Tool: Git, Graphviz or D3.js
- Main Book: Pro Git by Scott Chacon
What you’ll build: A tool that reads a Git repository and generates a visual graph showing commits, branches, merges, and tags—revealing the true DAG structure that underlies Git history.
Why it teaches Git workflows: When you visualize the DAG, you finally understand why rebase “rewrites history” (it creates new commits with different parents), why merge creates a commit with two parents, and how branches are just movable pointers.
Core challenges you’ll face:
- Walking the commit graph efficiently → maps to understanding parent relationships
- Detecting merge commits (multiple parents) → maps to understanding merge vs. rebase
- Positioning nodes for readability → maps to understanding branch topology
- Mapping refs to commits → maps to understanding branches as pointers
Key Concepts:
- DAG traversal: Grokking Algorithms Ch. 6 — Aditya Bhargava
- Git refs: Pro Git Ch. 10.3 — Scott Chacon
- Graph layout algorithms: Sugiyama algorithm for DAG visualization
Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1 completed, basic graph theory, understanding of topological sort
Real World Outcome
You’ll have a tool that generates visual representations of Git history, revealing the true graph structure:
Example Output:
$ ./git-graph --repo /path/to/repo --output graph.png
Analyzing repository...
Found 156 commits across 8 branches
Detected 12 merge commits
Generating graph...
Saved to: graph.png
ASCII preview:
* 3b18e51 (HEAD -> main) Merge feature-auth
|\
| * a1b2c3d Add password hashing
| * 5e6f7a8 Add login form
|/
* 9b0c1d2 Update README
* e3f4a5b Initial commit
$ ./git-graph --repo /path/to/repo --format svg --show-refs
[Opens browser with interactive SVG showing]:
- Commit nodes colored by author
- Branch labels at their current positions
- Merge commits highlighted
- Clickable nodes showing commit details
The Core Question You’re Answering
“What does ‘rewriting history’ actually mean, and why is the commit graph a DAG, not a tree?”
Before you write any code, sit with this question. A DAG (Directed Acyclic Graph) allows multiple parents (merges) and multiple children (branches), but no cycles. Understanding this structure explains why you can’t have a commit that’s its own ancestor.
Concepts You Must Understand First
Stop and research these before coding:
- Graph Theory Basics
- What’s the difference between a DAG and a tree?
- What’s topological sorting and why does it matter for Git?
- How do you detect cycles in a graph?
- Book Reference: “Grokking Algorithms” Ch. 6 — Bhargava
- Git References
- What’s the difference between a branch, a tag, and HEAD?
- What does “detached HEAD” mean?
- Where are refs stored in `.git/`?
- Book Reference: “Pro Git” Ch. 10.3 — Chacon
- Merge Commit Structure
- How does a merge commit differ from a regular commit?
- What are the first and second parents of a merge?
- What does `git log --first-parent` show?
- Book Reference: “Pro Git” Ch. 3.2 — Chacon
Questions to Guide Your Design
Before implementing, think through these:
- Graph Construction
- How will you build the graph in memory?
- What data structure represents a commit node?
- How do you handle the fact that Git stores parents, not children?
- Layout Algorithm
- How do you position nodes so branches don’t overlap?
- How do you handle very long linear histories?
- Should time flow top-to-bottom or left-to-right?
- Branch Assignment
- A commit can be reachable from multiple branches—how do you show this?
- How do you determine the “main line” for display?
Thinking Exercise
Trace a Merge Manually
Create a test repository and trace what happens:
git init test-repo && cd test-repo
echo "initial" > file.txt && git add . && git commit -m "A"
git checkout -b feature
echo "feature" >> file.txt && git commit -am "B"
echo "more feature" >> file.txt && git commit -am "C"
git checkout main
echo "main work" >> file.txt && git commit -am "D"
git merge feature -m "E: Merge feature"
Questions while tracing:
- Draw the DAG on paper. How many parents does commit E have?
- If you now run `git log --oneline`, what order do you see?
- What about `git log --oneline --first-parent`?
- Look at `.git/refs/heads/` — what files exist?
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain the difference between merge and rebase using a graph diagram.”
- “What happens to the commit graph when you force push after a rebase?”
- “How would you find the common ancestor of two branches programmatically?”
- “Why does `git log` show commits in the order it does?”
- “What’s the complexity of finding if commit A is an ancestor of commit B?”
Hints in Layers
Hint 1: Starting Point
Read all refs from `.git/refs/` (branches and tags), from `.git/packed-refs`, and from HEAD. Each resolves to a commit SHA.
Hint 2: Graph Building
Starting from each ref, walk parent pointers recursively. Track visited commits so shared history isn’t traversed twice (Git guarantees the graph has no cycles).
Hint 3: Layout
Assign each branch a “lane” (column). Commits on the same branch go in the same lane. Merge commits connect lanes.
Hint 4: Tools
Graphviz’s DOT format is simple: "sha1" -> "sha2" for edges. Let Graphviz handle layout with dot -Tpng.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Graph algorithms | “Grokking Algorithms” by Bhargava | Ch. 6 |
| Git internal refs | “Pro Git” by Chacon | Ch. 10.3 |
| DAG visualization | “Graph Drawing” by Tamassia | Ch. 9 |
Implementation Hints
Start by generating DOT format for Graphviz—it handles layout automatically:
digraph G {
  rankdir=BT;  // bottom to top
  node [shape=circle];
  "3b18e51" [label="E\nMerge"];
  "a1b2c3d" [label="C"];
  "5e6f7a8" [label="B"];
  "9b0c1d2" [label="D"];
  "e3f4a5b" [label="A"];
  "3b18e51" -> "9b0c1d2";  // first parent (main: D)
  "3b18e51" -> "a1b2c3d";  // second parent (merged feature branch: C)
  "a1b2c3d" -> "5e6f7a8";
  "5e6f7a8" -> "e3f4a5b";
  "9b0c1d2" -> "e3f4a5b";
}
For branch labels, use Graphviz node attributes to add color or labels at the commit the branch points to.
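If you'd rather not re-parse raw objects from Project 1, `git log` can emit the parent structure for you; a hedged sketch that converts it straight to DOT:

```python
# Emit Graphviz DOT edges from `git log --format="%H %P"` (SHA + parent SHAs).
import subprocess

log = subprocess.run(
    ["git", "log", "--all", "--format=%H %P"],
    capture_output=True, text=True, check=True,
).stdout

print("digraph G { rankdir=BT;")
for line in log.splitlines():
    sha, *parents = line.split()
    for parent in parents:              # merge commits contribute two edges
        print(f'  "{sha[:7]}" -> "{parent[:7]}";')
print("}")
```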
Learning milestones:
- You can walk parent pointers and build a graph → You understand the DAG structure
- You can identify merge commits by parent count → You understand merge mechanics
- You can map refs to commits → You understand branches as pointers
- You can generate readable visualizations → You can explain Git history to others
Project 3: Interactive Rebase Simulator
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Rust, TypeScript
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Git Internals / State Machines
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A tool that simulates git rebase -i without actually modifying the repository, showing exactly what commits would be created, dropped, or modified—and explaining why.
Why it teaches Git workflows: Interactive rebase is the most powerful and most misunderstood Git command. By simulating it step-by-step, you’ll understand that rebase doesn’t “move” commits—it creates new commits with the same changes but different parents (and therefore different SHAs).
Core challenges you’ll face:
- Parsing the todo list format → maps to understanding rebase operations
- Simulating commit replaying → maps to understanding how commits are recreated
- Handling squash/fixup → maps to understanding commit combination
- Predicting new SHAs → maps to understanding content-addressability
Key Concepts:
- Rebase internals: Pro Git Ch. 3.6 — Scott Chacon
- Interactive rebase commands: Pro Git Ch. 7.6 — Scott Chacon
- State machine design: Clean Code Ch. 10 — Martin
Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Projects 1 and 2 completed, understanding of rebase vs. merge
Real World Outcome
You’ll have a tool that shows exactly what an interactive rebase would do, step by step:
Example Output:
$ ./rebase-sim --repo /path/to/repo --onto main --branch feature
=== Interactive Rebase Simulator ===
Simulating: git rebase -i main feature
Current branch 'feature' has 5 commits not on 'main':
abc1234 Add user model
def5678 Fix typo in user model
ghi9012 Add authentication
jkl3456 WIP debugging
mno7890 Add password hashing
Enter todo commands (or press Enter for default 'pick all'):
pick abc1234
squash def5678
pick ghi9012
drop jkl3456
pick mno7890
=== SIMULATION RESULTS ===
Step 1: pick abc1234 "Add user model"
Old SHA: abc1234...
New SHA: xyz7777... (different because parent changed!)
Old parent: 111222... (old main)
New parent: 999888... (current main tip)
Step 2: squash def5678 "Fix typo in user model"
Combining with previous commit...
New message will be:
"Add user model
Fix typo in user model"
Combined SHA: uvw4444...
Step 3: pick ghi9012 "Add authentication"
Old SHA: ghi9012...
New SHA: rst5555...
Step 4: drop jkl3456 "WIP debugging"
⚠️ This commit will be DELETED
Changes in this commit:
- src/debug.py (will be lost!)
Step 5: pick mno7890 "Add password hashing"
Old SHA: mno7890...
New SHA: pqr6666...
=== FINAL STATE ===
main: 999888... (unchanged)
feature: pqr6666... (was: mno7890...)
Commits on feature after rebase: 3 (was 5)
New commits created: 3; the 5 originals become unreachable (recoverable via reflog until GC)
WARNING: 1 commit dropped. Its changes may be lost!
The Core Question You’re Answering
“Why does rebase ‘rewrite history,’ and what does that actually mean at the commit level?”
Before you write any code, sit with this question. When you rebase, Git doesn’t move commits—it replays the changes onto a new base, creating entirely new commits. The old commits still exist (until garbage collection), but your branch pointer moves to the new chain.
Concepts You Must Understand First
Stop and research these before coding:
- Commit Identity
- What determines a commit’s SHA?
- If you change just the parent, what happens to the SHA?
- If you change the commit message, what happens to the SHA?
- Book Reference: “Pro Git” Ch. 10.2 — Chacon
- Rebase Operations
- What does each rebase command (pick, squash, fixup, reword, edit, drop) do?
- How does squash differ from fixup?
- What happens during a rebase conflict?
- Book Reference: “Pro Git” Ch. 7.6 — Chacon
- The Three-Way Merge
- How does Git apply changes from one commit onto another?
- What’s the “merge base” in a rebase context?
- Why can rebase produce different conflicts than merge?
- Book Reference: “Pro Git” Ch. 3.2 — Chacon
Questions to Guide Your Design
Before implementing, think through these:
- Simulation Fidelity
- How will you calculate what the new SHA would be without creating objects?
- Can you predict if there would be conflicts?
- How do you represent the state after each step?
- Commit Combination
- When squashing, how do you combine commit messages?
- When squashing, how do you combine tree states?
- What if squashed commits touched the same file?
- User Interface
- How do you present the todo list for editing?
- How do you show the diff between old and new state?
- How do you warn about potentially lost changes?
Thinking Exercise
Trace a Rebase Manually
Create and rebase a test branch:
git init test && cd test
echo "a" > file && git add . && git commit -m "A"
echo "b" > file && git commit -am "B"
git checkout -b feature
echo "c" > file && git commit -am "C"
echo "d" > file && git commit -am "D"
git checkout main
echo "e" > file && git commit -am "E"
git checkout feature
git log --oneline --all --graph # Note the SHAs
git rebase main
git log --oneline --all --graph # Compare the SHAs
Questions while tracing:
- What are the SHAs of C and D before rebase?
- What are the SHAs of C’ and D’ after rebase?
- Can you find the old commits with `git reflog`?
- What happened to the parent pointers?
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain what happens step-by-step when you run `git rebase main` from a feature branch.”
- “Why should you never rebase commits that have been pushed to a shared branch?”
- “What’s the difference between `git rebase -i` with squash versus fixup?”
- “How would you recover commits that were ‘lost’ during a rebase?”
- “When would you use rebase vs. merge, and what are the tradeoffs?”
Hints in Layers
Hint 1: Starting Point
Parse the todo file format: <command> <sha> <message>. The commands are: pick, reword, edit, squash, fixup, drop.
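A tiny parser for that todo format might look like this (a sketch that assumes one operation per line and skips comments):

```python
# Parse a rebase todo list into (command, sha, message) tuples.
VALID = {"pick", "reword", "edit", "squash", "fixup", "drop"}

def parse_todo(text: str):
    ops = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cmd, sha, *rest = line.split(maxsplit=2)
        if cmd not in VALID:
            raise ValueError(f"unknown todo command: {cmd}")
        ops.append((cmd, sha, rest[0] if rest else ""))
    return ops
```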
Hint 2: SHA Calculation
A commit’s SHA is sha1(f"commit {size}\0{content}"). The content includes tree, parent(s), author, committer, and message.
Hint 3: Squash Logic
When squashing, the tree comes from applying both commits’ changes, and the message combines both (unless fixup, which discards the second message).
Hint 4: Conflict Detection
To predict conflicts, you’d need to simulate the three-way merge. For this project, you can note “potential conflict” when the same file is modified on both sides.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Rebase in depth | “Pro Git” by Chacon | Ch. 3.6, 7.6 |
| Three-way merge | “Version Control with Git” by Loeliger | Ch. 9 |
| Reflog and recovery | “Pro Git” by Chacon | Ch. 7.3 |
Implementation Hints
The rebase todo format is straightforward:
pick abc1234 First commit message
squash def5678 Second commit message
reword ghi9012 Third commit message
For simulation, track the state as you process each line:
pick: New commit with same tree, new parentsquash/fixup: Combine with previous commitreword: New commit with modified messagedrop: Skip entirelyedit: Pause (in your simulation, just note it)
To compute what the new SHA would be, you need:
- The tree SHA (recomputed by replaying the commit’s changes onto the new base; it equals the original only if the base tree is unchanged)
- The new parent SHA (previous simulated commit or rebase base)
- The author info (usually preserved)
- The committer info (YOU, at current time)
- The message
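Assembling those fields and hashing them is all the prediction takes; a sketch (assuming a single parent, with author/committer strings in Git's `Name <email> timestamp tz` form):

```python
# Predict a commit SHA from its fields: sha1 of b"commit <size>\0" + body.
import hashlib

def predict_sha(tree: str, parent: str, author: str,
                committer: str, message: str) -> str:
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {committer}\n"
        f"\n{message}\n"
    ).encode()
    return hashlib.sha1(f"commit {len(body)}\0".encode() + body).hexdigest()
```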
Learning milestones:
- You can parse and validate a todo list → You understand rebase operations
- You can simulate pick and compute new SHAs → You understand commit recreation
- You can simulate squash/fixup → You understand commit combination
- You can detect potential issues (drops, conflicts) → You understand rebase risks
Project 4: Three-Way Merge Engine
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go, Python
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Diff Algorithms / Merge Strategies
- Software or Tool: Git (for comparison)
- Main Book: The Algorithm Design Manual by Skiena
What you’ll build: A three-way merge tool that takes a base version, “ours” version, and “theirs” version of a file and produces either a merged result or conflict markers—exactly like Git does.
Why it teaches Git workflows: Every merge, rebase, and cherry-pick uses three-way merge internally. Understanding this algorithm explains why certain changes conflict and others don’t, and why merge is generally safer than manual patching.
Core challenges you’ll face:
- Implementing diff (longest common subsequence) → maps to understanding how changes are detected
- Handling non-conflicting parallel changes → maps to understanding automatic merge
- Detecting and marking conflicts → maps to understanding merge conflict format
- Choosing between merge strategies → maps to understanding recursive vs. resolve vs. octopus
Key Concepts:
- Longest Common Subsequence: The Algorithm Design Manual Ch. 8 — Skiena
- diff algorithm: “An O(ND) Difference Algorithm” — Eugene Myers
- Three-way merge: Pro Git Ch. 3.2 — Chacon
Difficulty: Expert
Time estimate: 2-4 weeks
Prerequisites: Projects 1-3 completed, dynamic programming, understanding of diff
Real World Outcome
You’ll have a merge tool that can combine file versions exactly like Git:
Example Output:
$ ./merge3 base.txt ours.txt theirs.txt
=== Three-Way Merge ===
Base version:
1: Hello World
2: This is a test
3: Goodbye
Ours version (changes on our branch):
1: Hello World
2: This is a test
3: This is our change
4: Goodbye
Theirs version (changes on their branch):
1: Hello World
2: Their modification here
3: This is a test
4: Goodbye
=== Diff Analysis ===
Line 2: THEIRS inserted (base→theirs adds a line; base = ours here)
Line 3: OURS inserted (ours adds a line before "Goodbye")
=== Merge Result (no conflicts!) ===
1: Hello World
2: Their modification here
3: This is a test
4: This is our change
5: Goodbye
$ ./merge3 base.txt ours.txt theirs.txt --conflict-case
=== CONFLICT DETECTED ===
Both modified line 2:
Base: "This is a test"
Ours: "Our version of line 2"
Theirs: "Their version of line 2"
Merged output with conflict markers:
1: Hello World
<<<<<<< OURS
2: Our version of line 2
=======
2: Their version of line 2
>>>>>>> THEIRS
3: Goodbye
The Core Question You’re Answering
“How does Git know when changes can be automatically merged and when they conflict?”
Before you write any code, sit with this question. The key insight is the BASE version—Git doesn’t just compare two files, it compares both to their common ancestor. If only one side changed a line, that change can be applied automatically.
Concepts You Must Understand First
Stop and research these before coding:
- Longest Common Subsequence (LCS)
- What’s the difference between LCS and longest common substring?
- How does dynamic programming solve LCS in O(mn) time?
- How does LCS relate to computing diffs?
- Book Reference: “The Algorithm Design Manual” Ch. 8 — Skiena
- The Diff Algorithm
- How does Myers’ diff algorithm work?
- What’s an edit script?
- How do you go from LCS to a list of insertions/deletions?
- Paper: “An O(ND) Difference Algorithm” — Eugene Myers
- Three-Way Merge Logic
- What are the four possible states of a line (unchanged, ours-only, theirs-only, both)?
- When is a change non-conflicting?
- What’s the format of Git’s conflict markers?
- Book Reference: “Pro Git” Ch. 3.2 — Chacon
Questions to Guide Your Design
Before implementing, think through these:
- Diff Representation
- How will you represent a diff? As edit operations? As hunks?
- How do you handle lines that moved (not just added/deleted)?
- Should you diff by lines or by characters within lines?
- Merge Algorithm
- How do you align the three versions?
- What if ours and theirs made the same change?
- What if ours deleted a line that theirs modified?
- Conflict Handling
- How do you represent the conflict region?
- Should you include context lines?
- How do you handle nested conflicts?
Thinking Exercise
Trace a Merge Manually
Set up a conflict scenario:
git init merge-test && cd merge-test
echo -e "line1\nline2\nline3" > file.txt
git add . && git commit -m "Base"
git checkout -b feature
echo -e "line1\nfeature-line2\nline3" > file.txt
git commit -am "Feature change"
git checkout main
echo -e "line1\nmain-line2\nline3" > file.txt
git commit -am "Main change"
git merge feature # Will conflict!
cat file.txt # See conflict markers
Questions while tracing:
- Draw out base, ours, theirs for line 2
- Why did Git detect a conflict?
- What if only one branch had changed line 2?
- Look at `.git/MERGE_HEAD` — what’s stored there?
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain the three-way merge algorithm. What is the ‘base’ and why is it important?”
- “What’s the time complexity of computing a diff between two files?”
- “Why might `git merge` succeed when manual file comparison would suggest a conflict?”
- “What merge strategies does Git support, and when would you use each?”
- “How would you resolve a merge conflict where both sides made the same change?”
Hints in Layers
Hint 1: Starting Point
Implement diff first. The simplest approach: compute LCS, then derive insertions/deletions from what’s NOT in the LCS.
Hint 2: LCS Algorithm
Use dynamic programming. Build a table where lcs[i][j] = length of LCS of first i lines of A and first j lines of B. Backtrack to find the actual sequence.
Hint 3: Three-Way Logic
Compute diff(base, ours) and diff(base, theirs). For each line region, categorize: unchanged, ours-only, theirs-only, or conflict.
Hint 4: Conflict Markers
Git’s format:
<<<<<<< HEAD
our content
=======
their content
>>>>>>> branch-name
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| LCS algorithm | “The Algorithm Design Manual” by Skiena | Ch. 8 |
| Diff algorithm | “An O(ND) Difference Algorithm” by Myers | Paper |
| Merge internals | “Version Control with Git” by Loeliger | Ch. 9 |
Implementation Hints
The LCS dynamic programming table:
"" l i n e 1
"" 0 0 0 0 0 0
l 0 1 1 1 1 1
i 0 1 2 2 2 2
n 0 1 2 3 3 3
...
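The same table in code: a minimal, line-based dynamic-programming sketch in O(mn) time and space:

```python
# Build the LCS length table for two sequences of lines.
def lcs_table(a: list[str], b: list[str]) -> list[list[int]]:
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1   # extend a common line
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table
```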
Three-way merge logic (the pseudocode as a runnable Python function):
def merge_region(base, ours, theirs):
    if base == ours == theirs:
        return base                        # unchanged
    elif base == ours and base != theirs:
        return theirs                      # theirs changed, use theirs
    elif base != ours and base == theirs:
        return ours                        # ours changed, use ours
    elif ours == theirs:
        return ours                        # same change, either is fine
    else:                                  # both changed differently: conflict!
        return f"<<<<<<< OURS\n{ours}\n=======\n{theirs}\n>>>>>>> THEIRS"
Learning milestones:
- You can compute LCS of two sequences → You understand dynamic programming for strings
- You can generate a diff from LCS → You understand edit scripts
- You can merge non-conflicting changes → You understand three-way merge logic
- You can generate proper conflict markers → You understand Git’s conflict format
Project 5: Git Hooks Framework
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Bash/Python
- Alternative Programming Languages: Go, Rust, Node.js
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Git Hooks / CI/CD
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A Git hooks management system (like Husky, but from scratch) that allows configuring pre-commit, pre-push, and commit-msg hooks via a config file, with support for running multiple scripts per hook.
Why it teaches Git workflows: Hooks are how teams enforce code quality—running linters, tests, and format checks before code is committed or pushed. Understanding hooks is essential for implementing trunk-based development and code review workflows.
Core challenges you’ll face:
- Understanding the hook lifecycle → maps to understanding when Git runs each hook
- Managing hook installation → maps to understanding `.git/hooks/` vs. tracked scripts
- Handling hook failures → maps to understanding how hooks block operations
- Sharing hooks across a team → maps to understanding that `.git/` itself is never tracked
Key Concepts:
- Git hooks: Pro Git Ch. 8.3 — Scott Chacon
- Exit codes: Shell scripting fundamentals
- Process execution: The Linux Programming Interface Ch. 24 — Kerrisk
Difficulty: Intermediate
Time estimate: 1 week
Prerequisites: Shell scripting, understanding of Git basics
Real World Outcome
You’ll have a hooks framework that your team can use:
Example Output:
$ cat .git-hooks.yaml
hooks:
pre-commit:
- name: "Format check"
run: "npm run format:check"
- name: "Lint"
run: "npm run lint"
- name: "Type check"
run: "npm run typecheck"
commit-msg:
- name: "Conventional commit"
run: "./scripts/check-commit-msg.sh"
pre-push:
- name: "Tests"
run: "npm test"
- name: "Build"
run: "npm run build"
$ ./hooks-manager install
Installing hooks framework...
✓ Created .git/hooks/pre-commit
✓ Created .git/hooks/commit-msg
✓ Created .git/hooks/pre-push
Hooks installed successfully!
$ git commit -m "bad commit"
Running pre-commit hooks...
[1/3] Format check... ✓ (0.5s)
[2/3] Lint... ✗ FAILED (1.2s)
Error: ESLint found 3 errors:
src/index.ts:15 - Unexpected any type
src/utils.ts:8 - Missing return type
src/utils.ts:22 - Unused variable 'temp'
Pre-commit hook failed. Commit aborted.
Fix the issues above or use --no-verify to skip hooks.
$ # Fix issues...
$ git commit -m "feat: add user authentication"
Running pre-commit hooks...
[1/3] Format check... ✓ (0.5s)
[2/3] Lint... ✓ (1.1s)
[3/3] Type check... ✓ (2.3s)
Running commit-msg hooks...
[1/1] Conventional commit... ✓ (0.1s)
[feature 3a4b5c6] feat: add user authentication
3 files changed, 127 insertions(+)
The Core Question You’re Answering
“How do teams enforce code quality automatically, and why can’t Git hooks be shared through the repository?”
Before you write any code, sit with this question. The .git directory is not tracked by Git itself, so hooks don’t travel with the repo. This is why tools like Husky exist—to bridge tracked config files with untracked hook scripts.
Concepts You Must Understand First
Stop and research these before coding:
- Available Git Hooks
- What hooks exist (pre-commit, commit-msg, pre-push, post-merge, etc.)?
- What arguments does each hook receive?
- What does the exit code mean for each hook?
- Book Reference: “Pro Git” Ch. 8.3 — Chacon
- Hook Execution Context
- What’s the working directory when a hook runs?
- What environment variables are available?
- How do you access staged changes vs. working directory?
- Book Reference: “Pro Git” Ch. 8.3 — Chacon
- Exit Codes
- How do exit codes control whether Git proceeds?
- How do you propagate failures from child processes?
- What exit codes should your framework use?
- Book Reference: “The Linux Command Line” Ch. 27 — Shotts
Questions to Guide Your Design
Before implementing, think through these:
- Configuration
- Where will the config file live (`.git-hooks.yaml`, `.hooks/`, `package.json`)?
- How will users specify multiple scripts per hook?
- How will you handle hook arguments and stdin?
- Installation
- How will you install hooks to `.git/hooks/`?
- How will you avoid overwriting a user’s custom hooks?
- How will you handle reinstallation on config changes?
- Execution
- How will you run multiple scripts and aggregate results?
- Should scripts run in parallel or serial?
- How will you display progress and output?
Thinking Exercise
Explore Git Hooks
Set up and test hooks manually:
git init hook-test && cd hook-test
echo "initial" > file.txt && git add . && git commit -m "init"
# Create a failing pre-commit hook
cat > .git/hooks/pre-commit << 'EOF'
#!/bin/bash
echo "Pre-commit hook running..."
echo "Checking for TODO comments..."
if grep -r "TODO" .; then
echo "ERROR: Found TODO comments!"
exit 1
fi
echo "All clear!"
exit 0
EOF
chmod +x .git/hooks/pre-commit
# Test it
echo "// TODO: fix this" >> file.txt
git add file.txt
git commit -m "test" # Should fail!
Questions while exploring:
- What exit code caused the commit to fail?
- What’s in `$GIT_INDEX_FILE` during the hook?
- Try `git commit --no-verify` — what happens?
- Check what argument `commit-msg` receives (the path to the message file)
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you enforce that all commits pass linting before being pushed?”
- “Why can’t you just add your hooks to `.git/hooks/` and commit them?”
- “How would you skip hooks for a work-in-progress commit?”
- “How do commit-msg hooks work, and how would you enforce conventional commits?”
Hints in Layers
Hint 1: Starting Point
Your installed hook script should: read the config file, determine which scripts to run, run them in order, and exit 0 only if all succeed.
Hint 2: Config Parsing
YAML is nice for config. In bash, you might use simpler formats or shell out to Python for parsing.
Hint 3: Hook Arguments
For commit-msg, argument 1 is the path to the message file. For pre-push, stdin contains lines with local/remote refs.
Hint 4: Progress Display
Use ANSI colors and \r to overwrite lines. Show [1/3] Linting... then update to [1/3] Linting... ✓
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git hooks | “Pro Git” by Chacon | Ch. 8.3 |
| Shell scripting | “The Linux Command Line” by Shotts | Ch. 24-27 |
| Process management | “The Linux Programming Interface” by Kerrisk | Ch. 24-28 |
Implementation Hints
Your installed hook script pattern:
#!/bin/bash
# This file is auto-generated - do not edit
HOOK_NAME=$(basename "$0")
CONFIG_FILE=".git-hooks.yaml"
if [ ! -f "$CONFIG_FILE" ]; then
exit 0 # No config, allow operation
fi
# Parse config, find scripts for this hook type
# Run each script in sequence
# Exit with first failure or 0 if all pass
For the installer:
- Read config file
- For each hook type with scripts, create `.git/hooks/{hookname}`
- Make each executable with `chmod +x`
- Optionally back up existing hooks
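A hedged Python sketch of this installer; PyYAML is assumed for parsing, and `hooks-manager run` is a hypothetical subcommand of this project's own CLI that the generated stub delegates to:

```python
# Install generated hook stubs for every hook type in .git-hooks.yaml.
import os
import stat

import yaml  # PyYAML, assumed installed

STUB = "#!/bin/bash\n# auto-generated - do not edit\n./hooks-manager run {hook} \"$@\"\n"

def install(repo: str = ".") -> None:
    with open(os.path.join(repo, ".git-hooks.yaml")) as f:
        config = yaml.safe_load(f)
    for hook in config.get("hooks", {}):
        path = os.path.join(repo, ".git", "hooks", hook)
        if os.path.exists(path):
            os.rename(path, path + ".backup")   # preserve any custom hook
        with open(path, "w") as f:
            f.write(STUB.format(hook=hook))
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)  # make executable
        print(f"✓ Created {path}")
```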
Learning milestones:
- You can create and trigger a simple hook → You understand hook basics
- You can install hooks from config → You understand the installation problem
- You can run multiple scripts per hook → You understand hook orchestration
- You can display progress and handle failures → You have a usable framework
Project 6: Trunk-Based Development Pipeline
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, Bash, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: CI/CD / Feature Flags / Git Workflows
- Software or Tool: Git, GitHub Actions or similar
- Main Book: Accelerate by Nicole Forsgren
What you’ll build: A complete trunk-based development pipeline with feature flags, automated testing on every commit, and a CLI tool that manages short-lived branches and enforces trunk-based discipline.
Why it teaches Git workflows: Trunk-based development is how high-performing teams work—Google, Facebook, and Netflix all use variations of it. By implementing the tooling yourself, you’ll understand why long-lived branches cause merge pain and how feature flags enable shipping incomplete code safely.
Core challenges you’ll face:
- Implementing feature flags → maps to understanding how to hide incomplete features
- Enforcing short-lived branches → maps to understanding the cost of divergence
- Automating branch cleanup → maps to understanding branch lifecycle
- Building CI integration → maps to understanding continuous integration
Key Concepts:
- Trunk-based development: Accelerate Ch. 4 — Forsgren
- Feature flags: Continuous Delivery Ch. 10 — Humble & Farley
- CI/CD principles: The DevOps Handbook Ch. 3 — Kim et al.
Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 1-5 completed, understanding of CI systems
Real World Outcome
You’ll have a CLI and supporting infrastructure for trunk-based development:
Example Output:
$ trunk init
Initializing trunk-based development for this repository...
✓ Created .trunk/config.yaml
✓ Created .trunk/feature-flags.json
✓ Set up branch policies in .github/settings.yaml
✓ Created GitHub Actions workflow
Trunk-based development enabled!
Main branch: main
Max branch age: 2 days
Feature flags file: .trunk/feature-flags.json
$ trunk branch create auth-improvements
Creating short-lived branch 'auth-improvements'...
✓ Branch created from main
✓ Tracking enabled (will warn if branch > 2 days)
✓ Upstream set to origin/main
Tip: Merge back to main within 2 days to stay trunk-based!
$ trunk status
=== Trunk Status ===
Main branch: main (12 commits ahead of last deploy)
Active branches:
auth-improvements (you) 0.5 days old ✓ fresh
user-profile (alice) 1.8 days old ⚠️ getting stale
legacy-cleanup (bob) 4.2 days old ❌ STALE - violates trunk-based
Feature flags:
new_auth_flow: enabled for: internal, beta-users
profile_v2: disabled (in development)
dark_mode: enabled for: 10% rollout
$ trunk flag create new-checkout-flow
Created feature flag 'new_checkout_flow' (disabled by default)
Updated .trunk/feature-flags.json:
{
  "new_checkout_flow": {
    "enabled": false,
    "enabledFor": [],
    "createdAt": "2024-01-15",
    "owner": "you"
  }
}
Usage in code:
if (isEnabled('new_checkout_flow')) {
  // new code
}
$ trunk merge
Running pre-merge checks...
✓ Branch age: 0.5 days (ok)
✓ Tests passed
✓ No merge conflicts with main
✓ Code review approved
Squash-merging 3 commits into main...
[main abc1234] feat: improve auth flow (#127)
✓ Branch 'auth-improvements' merged and deleted
✓ Deployment triggered to staging
The Core Question You’re Answering
“Why do high-performing teams commit directly to main, and how do they ship incomplete features without breaking production?”
Before you write any code, sit with this question. The answer is feature flags plus CI/CD. Incomplete code goes to production but is hidden behind flags. This eliminates merge hell and enables true continuous integration.
Concepts You Must Understand First
Stop and research these before coding:
- Trunk-Based Development
- What defines trunk-based development vs. GitFlow?
- Why are short-lived branches (< 2 days) important?
- How do you handle work that takes longer than 2 days?
- Book Reference: “Accelerate” Ch. 4 — Forsgren
- Feature Flags
- What’s the difference between release flags and experiment flags?
- How do you gradually roll out a feature (canary deployment)?
- What’s the lifecycle of a feature flag?
- Book Reference: “Continuous Delivery” Ch. 10 — Humble & Farley
- Continuous Integration
- What’s the difference between CI and continuous delivery?
- Why must you build on every commit in trunk-based?
- How do you handle flaky tests?
- Book Reference: “The DevOps Handbook” Ch. 3 — Kim et al.
Questions to Guide Your Design
Before implementing, think through these:
- Branch Policies
- How will you track branch age?
- What should happen when a branch exceeds the limit?
- How do you handle exceptions (releases, hotfixes)?
- Feature Flags
- Where should flags be stored (code, config, external service)?
- How do you handle flag evaluation at runtime?
- How do you clean up old flags?
- CI Integration
- What workflows need to run on each commit?
- How do you handle test failures on main?
- How do you integrate with existing CI systems?
Thinking Exercise
Simulate a Trunk-Based Sprint
Plan how you’d implement a feature trunk-based:
Feature: Add password strength indicator to signup
Day 1: Create branch, add strength calculation logic (behind flag)
Commit to main (hidden behind flag, passes tests)
Day 2: Add UI component (behind flag), deploy to staging
Internal QA tests with flag enabled
Day 3: Enable for beta users, collect feedback
Day 4: Fix issues based on feedback, new commits to main
Day 5: Enable for 25% of users
Week 2: 100% rollout, remove feature flag, delete old code
Questions while planning:
- Where are the merge conflicts? (Answer: nowhere!)
- What if you find a bug during rollout?
- What if the feature needs to be reverted?
- How long was the branch alive? (1-2 days per micro-feature)
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain trunk-based development and why it reduces integration problems.”
- “How would you implement a feature flag system from scratch?”
- “What’s the difference between a feature flag and a configuration setting?”
- “How do you handle database migrations in trunk-based development?”
- “What testing strategies are essential for trunk-based development?”
Hints in Layers
Hint 1: Starting Point
Start with the branch age tracker. `git log -1 --format=%ct` only gives the latest commit’s timestamp; approximate the branch’s creation time with its first commit not on main (see Implementation Hints below). Store branch metadata in `.trunk/`.
Hint 2: Feature Flags
A simple JSON file works for small teams. For runtime, load the JSON and expose an isEnabled(flagName, context) function.
Hint 3: CI Integration
Generate a GitHub Actions workflow that runs on push to main. Use the on: push trigger with proper caching for speed.
Hint 4: Merge Tooling
Your trunk merge should: check age, run tests locally, squash commits, push, and delete the remote branch.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Trunk-based development | “Accelerate” by Forsgren | Ch. 4 |
| Feature flags | “Continuous Delivery” by Humble & Farley | Ch. 10 |
| DevOps practices | “The DevOps Handbook” by Kim et al. | Ch. 3-5 |
Implementation Hints
Feature flag schema:
{
"flag_name": {
"enabled": false,
"enabledFor": ["user_123", "beta-testers"],
"percentage": 0,
"createdAt": "2024-01-15",
"owner": "alice@company.com"
}
}
Branch age tracking:
# Get branch creation time (approximate - first commit on branch not on main)
git log main..HEAD --format=%ct --reverse | head -1
Learning milestones:
- You can track branch age and warn on staleness → You understand branch discipline
- You can manage feature flags → You understand decoupling deploy from release
- You can integrate with CI → You understand continuous integration
- You can enforce policies automatically → You have a complete trunk-based setup
Project 7: Code Review Bot
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, TypeScript, Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: GitHub API / Static Analysis / Automation
- Software or Tool: GitHub, Git
- Main Book: Working Effectively with Legacy Code by Feathers
What you’ll build: An automated code review bot that comments on pull requests with actionable feedback—detecting large diffs, missing tests, style violations, and common anti-patterns.
Why it teaches Git workflows: Professional code review is the gatekeeper of code quality. By building a bot that performs automated reviews, you’ll understand what makes PRs easy or hard to review, why smaller PRs get approved faster, and how to structure changes for maximum reviewability.
Core challenges you’ll face:
- Accessing the GitHub API → maps to understanding how PR tools work
- Analyzing diffs programmatically → maps to understanding what reviewers look for
- Providing actionable feedback → maps to understanding effective code review
- Handling edge cases → maps to understanding real-world complexity
Key Concepts:
- GitHub API: GitHub REST/GraphQL API documentation
- Static analysis: Working Effectively with Legacy Code Ch. 13 — Feathers
- Code review best practices: The Pragmatic Programmer Ch. 7 — Hunt & Thomas
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: API integration experience, understanding of code analysis
Real World Outcome
You’ll have a bot that automatically reviews PRs:
Example Output:
=== Code Review Bot Report ===
PR #127: "Add user authentication"
Author: alice
Files changed: 8
Lines added: 347
Lines removed: 12
📊 SIZE ANALYSIS
⚠️ This PR is LARGE (347 lines added)
Consider splitting into smaller PRs for easier review.
Suggested split:
- auth/login.ts, auth/logout.ts (auth logic)
- components/LoginForm.tsx (UI)
- tests/* (test files)
📝 DIFF ANALYSIS
src/auth/login.ts:
Line 45: ⚠️ TODO comment found: "TODO: add rate limiting"
Consider creating an issue instead of a TODO.
Line 78: ⚠️ Hardcoded string "admin" detected
Consider using a constant or environment variable.
Line 112: ⚠️ No error handling for await expression
Consider wrapping in try/catch.
🧪 TEST COVERAGE
⚠️ New code in src/auth/ but no new tests in tests/auth/
Added files without corresponding tests:
- src/auth/login.ts
- src/auth/session.ts
Consider adding tests to maintain coverage.
📋 BEST PRACTICES
✓ Conventional commit message format
✓ No merge commits in PR
✓ Description includes context
⚠️ Missing "Testing" section in PR description
💬 AUTO-COMMENT POSTED TO PR:
"Thanks for the PR! I've found a few things to address:
- PR is large (347 lines) - consider splitting
- 2 TODO comments should be converted to issues
- Missing tests for new auth logic
- Please add a 'Testing' section to the description
See full analysis above. Happy to help if you have questions!"
The Core Question You’re Answering
“What makes code review effective, and how can automation enforce best practices without blocking legitimate work?”
Before you write any code, sit with this question. Good code review catches bugs, shares knowledge, and maintains quality—but bad code review is a bottleneck. Automation should handle the mechanical checks so humans can focus on design and logic.
Concepts You Must Understand First
Stop and research these before coding:
- GitHub Pull Request API
- How do you fetch PR details, files, and diff?
- How do you post comments on specific lines?
- What’s the difference between issue comments and review comments?
- Resource: GitHub REST API documentation
- Effective Code Review
- What makes a PR easy to review?
- What’s the ideal PR size?
- What should humans review vs. what should be automated?
- Book Reference: “The Pragmatic Programmer” Ch. 7 — Hunt & Thomas
- Static Analysis
- What patterns indicate potential bugs?
- How do you detect missing test coverage?
- What anti-patterns are machine-detectable?
- Book Reference: “Working Effectively with Legacy Code” Ch. 13 — Feathers
Questions to Guide Your Design
Before implementing, think through these:
- Trigger Mechanism
- How will the bot be triggered (webhook, polling, manual)?
- How do you authenticate with GitHub?
- How do you handle rate limiting?
- Analysis Types
- What checks are universal vs. project-specific?
- How do you configure checks per repository?
- How do you handle false positives?
- Feedback Delivery
- Should you comment on individual lines or summarize?
- How do you avoid being annoying (comment spam)?
- How do you handle re-reviews after changes?
Thinking Exercise
Analyze a Real PR
Find a PR on a popular open source project and analyze it:
- Open a PR with 20+ files changed
- For each file, note what you’d want automated checking for
- Identify patterns that humans shouldn’t have to catch manually
Questions while analyzing:
- How long did this PR take to get reviewed?
- Did reviewers comment on things a bot could catch?
- Would you have approved this PR as-is?
- What would make this PR easier to review?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What makes a pull request easy or hard to review?”
- “How would you design a system to automatically assign code reviewers?”
- “What’s the tradeoff between thorough automated checks and developer velocity?”
- “How would you handle a bot that generates too many false positives?”
- “What code review aspects should remain human-only?”
Hints in Layers
Hint 1: Starting Point
Use PyGithub or the REST API directly. Start with fetching PR details and printing file names.
Hint 2: Diff Analysis
The GitHub API returns diff hunks. Parse the @@ line numbers to know where changes occurred.
Hint 3: Pattern Detection
Regular expressions work for simple patterns (TODO, FIXME, hardcoded strings). For language-specific analysis, consider AST parsing.
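A sketch tying Hints 2 and 3 together, assuming the patch text comes from the GitHub API's file.patch field; the pattern list is deliberately tiny:

```python
import re

CHECKS = {
    "TODO comment": re.compile(r"\b(TODO|FIXME)\b"),
    "debug print": re.compile(r"console\.log\(|\bprint\("),
}

def flag_added_lines(patch):
    """Walk a unified diff, tracking new-file line numbers via @@ hunk headers."""
    findings, new_line = [], 0
    for line in patch.splitlines():
        hunk = re.match(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@", line)
        if hunk:
            new_line = int(hunk.group(1))  # next non-deleted line has this number
            continue
        if line.startswith(("+++", "---")):
            continue  # file headers, not content
        if line.startswith("+"):
            for name, pattern in CHECKS.items():
                if pattern.search(line):
                    findings.append((new_line, name, line[1:].strip()))
        if not line.startswith("-"):
            new_line += 1  # context and added lines advance the new file
    return findings
```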
Hint 4: Commenting
Use the “pull request review” API to batch comments. Individual line comments go in comments, overall feedback goes in body.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Code smells | “Working Effectively with Legacy Code” by Feathers | Ch. 13-16 |
| Review best practices | “The Pragmatic Programmer” by Hunt & Thomas | Ch. 7 |
| API design | “Build APIs You Won’t Hate” by Sturgeon | Ch. 4-6 |
Implementation Hints
GitHub API patterns:
from github import Github  # PyGithub

g = Github(token)                      # personal access token
repo = g.get_repo("owner/repo-name")

# Fetch PR
pr = repo.get_pull(pr_number)
for file in pr.get_files():
    # file.filename, file.patch, file.additions, file.deletions
    analyze_diff(file.patch)

# Post review: batch line comments plus an overall summary
pr.create_review(
    body="Overall feedback here",
    event="COMMENT",  # or "APPROVE" or "REQUEST_CHANGES"
    comments=[
        {"path": "src/file.ts", "line": 45, "body": "Consider..."}
    ]
)
Common checks to implement:
- PR size (files, lines changed)
- TODO/FIXME comments
- Hardcoded credentials/secrets
- Missing test files
- Console.log / print statements
- Long functions
- Deep nesting
Learning milestones:
- You can fetch and parse PR data → You understand the GitHub API
- You can detect common issues in diffs → You understand static analysis
- You can post meaningful comments → You understand review communication
- Your bot helps without annoying → You understand the balance
Project 8: Monorepo Task Runner
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, TypeScript, Python
- Coolness Level: Level 5: Pure Magic
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Build Systems / Graph Algorithms / Caching
- Software or Tool: Git
- Main Book: Software Engineering at Google by Winters et al.
What you’ll build: A monorepo task runner (like a mini Turborepo or Nx) that detects which packages changed, runs only affected tests, and caches results to avoid redundant work.
Why it teaches Git workflows: Monorepos are how Google, Microsoft, and many large companies organize code. The challenges—scale, affected detection, incremental builds—teach you how Git can be used as more than version control; it becomes the source of truth for what changed.
Core challenges you’ll face:
- Detecting affected packages → maps to understanding dependency graphs + git diff
- Implementing task caching → maps to understanding content-addressable storage
- Parallel task execution → maps to understanding topological order in DAGs
- Handling workspace dependencies → maps to understanding package relationships
Key Concepts:
- Monorepo architecture: Software Engineering at Google Ch. 16-18 — Winters et al.
- Build caching: Bazel documentation on remote caching
- Graph algorithms: Grokking Algorithms Ch. 6 — Bhargava
Difficulty: Expert Time estimate: 1 month+ Prerequisites: Projects 1-7 completed, graph algorithms, understanding of build systems
Real World Outcome
You’ll have a task runner that makes monorepos manageable:
Example Output:
$ mono status
=== Monorepo Status ===
Packages (5):
packages/core - library, no changes
packages/utils - library, 2 files changed
packages/api - app, depends on core, utils
packages/web - app, depends on core, utils
packages/cli - app, depends on core
Dependency graph:
api ──→ core
└───→ utils
web ──→ core
└───→ utils
cli ──→ core
$ mono affected --base=main
Analyzing changes from main...
Changed files:
packages/utils/src/string.ts (+12 -3)
packages/utils/src/date.ts (+5 -2)
Affected packages (3):
packages/utils - directly changed
packages/api - depends on utils
packages/web - depends on utils
Unaffected packages (2):
packages/core - no dependency on changed files
packages/cli - no dependency on changed files
$ mono test --affected
Running tests for affected packages...
[1/3] Testing utils...
Source files changed, invalidating cached result abc123
Running 12 tests...
✓ 12 passed (2.3s)
Cache stored: def456
[2/3] Testing api...
Dependency utils changed, cache invalidated
Running 47 tests...
✓ 47 passed (8.1s)
Cache stored: ghi789
[3/3] Testing web...
Dependency utils changed, cache invalidated
Running 83 tests...
✓ 83 passed (12.4s)
Cache stored: jkl012
Summary:
3 packages tested
142 tests passed
Total time: 22.8s (without cache: ~45s)
Cache hit rate: 0% (invalidated by changes)
$ mono test --affected # Run again, nothing changed
All 3 affected packages have valid cache entries.
✓ Nothing to run (22.8s saved)
The Core Question You’re Answering
“How do you build and test only what changed in a codebase with hundreds of packages?”
Before you write any code, sit with this question. The answer combines Git diff to know what changed, a dependency graph to know what’s affected, and content-addressable caching to skip redundant work.
Concepts You Must Understand First
Stop and research these before coding:
- Package Dependency Graphs
- How do you model dependencies between packages?
- What’s the difference between dependencies and devDependencies?
- How do you detect circular dependencies?
- Book Reference: “Grokking Algorithms” Ch. 6 — Bhargava
- Affected Detection
- How do you use git diff to find changed files?
- How do you map files to packages?
- How do you propagate “affected” through the dependency graph?
- Book Reference: “Software Engineering at Google” Ch. 17 — Winters et al.
- Task Caching
- What inputs determine a task’s cache key?
- How do you store and retrieve cache entries?
- When is it safe to reuse a cached result?
- Resource: Turborepo documentation on caching
Questions to Guide Your Design
Before implementing, think through these:
- Package Discovery
- How will you find packages in the repo (package.json, Cargo.toml, etc.)?
- How will you extract dependencies?
- How will you handle different package managers?
- Cache Key Calculation
- What inputs affect a task’s output (source files, dependencies, config)?
- How do you hash all these inputs efficiently?
- Should the cache be local, remote, or both?
- Task Orchestration
- How do you respect dependency order (topological sort)?
- How do you parallelize independent tasks?
- How do you handle task failures?
Thinking Exercise
Trace a Change Through Dependencies
Map out a monorepo change manually:
packages/
shared-utils/ (dependency of everything)
auth-service/ (depends on: shared-utils)
user-api/ (depends on: shared-utils, auth-service)
web-app/ (depends on: shared-utils, user-api)
cli-tool/ (depends on: shared-utils)
Change: Edit packages/shared-utils/src/format.ts
Questions while tracing:
- Which packages need to be rebuilt?
- In what order should they be rebuilt?
- If you had cached builds from yesterday, which caches are now invalid?
- How many packages could you build in parallel?
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you design a build system for a monorepo with 100 packages?”
- “Explain how you would calculate a cache key for a build task.”
- “What’s the time complexity of detecting affected packages?”
- “How do you handle diamond dependencies in a monorepo?”
- “What are the tradeoffs between monorepos and polyrepos?”
Hints in Layers
Hint 1: Starting Point
Start with package discovery. Glob for package.json files, parse them, extract dependencies.
Hint 2: Dependency Graph
Build an adjacency list where graph[pkg] = list of packages that depend on pkg. For affected detection, traverse from changed packages.
Hint 3: Cache Key
Compute: hash(source_files_hash + dependencies_cache_keys + config_hash). If any input changes, the cache is invalid.
Hint 4: Execution Order
Use Kahn’s algorithm for topological sort. Build a queue of packages with no pending dependencies; process and add newly unblocked packages.
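A sketch of a batch-parallel variant of Kahn's algorithm; the package names mirror the mono status example above, and the data shape is an assumption:

```python
def topo_batches(deps):
    """deps[pkg] = set of packages pkg depends on. Returns batches of
    packages whose dependencies are all satisfied, runnable in parallel."""
    pending = {pkg: set(d) for pkg, d in deps.items()}
    batches = []
    while pending:
        ready = [pkg for pkg, blockers in pending.items() if not blockers]
        if not ready:
            raise ValueError("circular dependency detected")
        batches.append(sorted(ready))
        for pkg in ready:
            del pending[pkg]
        for blockers in pending.values():
            blockers.difference_update(ready)
    return batches

deps = {"core": set(), "utils": set(),
        "api": {"core", "utils"}, "web": {"core", "utils"}, "cli": {"core"}}
print(topo_batches(deps))  # [['core', 'utils'], ['api', 'cli', 'web']]
```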
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Monorepo at scale | “Software Engineering at Google” by Winters et al. | Ch. 16-18 |
| Graph algorithms | “Grokking Algorithms” by Bhargava | Ch. 6 |
| Build system design | “The Bazel Book” (online) | Ch. 1-3 |
Implementation Hints
Cache key calculation:
import hashlib

def compute_cache_key(package):
    hasher = hashlib.sha256()
    # Hash source files (file_hash is assumed to return bytes, e.g. a digest)
    for file in get_source_files(package):
        hasher.update(file_hash(file))
    # Hash dependencies' cache keys; the recursion makes the key transitive
    # (memoize this in practice to avoid re-hashing shared dependencies)
    for dep in get_dependencies(package):
        hasher.update(compute_cache_key(dep).encode())
    # Hash config (build flags, task definition, environment)
    hasher.update(config_hash(package))
    return hasher.hexdigest()
Affected detection:
from collections import deque

def get_affected(changed_files, base_ref):
    # changed_files typically comes from: git diff --name-only {base_ref}...HEAD
    # Map changed files to the packages that own them
    changed_packages = set()
    for file in changed_files:
        pkg = get_package_for_file(file)
        if pkg:
            changed_packages.add(pkg)
    # BFS over the reverse dependency graph: anything that depends
    # (directly or transitively) on a changed package is affected
    affected = set(changed_packages)
    queue = deque(changed_packages)
    while queue:
        pkg = queue.popleft()
        for dependent in get_dependents(pkg):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
Learning milestones:
- You can discover packages and build a dependency graph → You understand monorepo structure
- You can detect affected packages from git diff → You understand change propagation
- You can compute and use cache keys → You understand incremental builds
- You can run tasks in topological order → You have a working task runner
Project 9: Git Bisect Automator
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: Bash, Go, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Binary Search / Git Bisect / Debugging
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A tool that wraps git bisect to automatically find the commit that introduced a bug by running a test script, with support for skip detection, performance optimizations, and detailed reporting.
Why it teaches Git workflows: Bisect is Git’s killer debugging feature—it uses binary search to find the exact commit that broke something. By building a wrapper, you’ll understand how bisect works internally and how to leverage it effectively in real debugging scenarios.
Core challenges you’ll face:
- Driving git bisect programmatically → maps to understanding bisect’s state machine
- Handling untestable commits → maps to understanding git bisect skip
- Detecting flaky tests → maps to understanding real-world complexity
- Optimizing search → maps to understanding binary search on DAGs
Key Concepts:
- Git bisect: Pro Git Ch. 7.10 — Scott Chacon
- Binary search: Algorithms Ch. 1 — Sedgewick & Wayne
- Test automation: Continuous Delivery Ch. 8 — Humble & Farley
Difficulty: Intermediate Time estimate: 1 week Prerequisites: Basic Git, understanding of binary search
Real World Outcome
You’ll have a tool that makes bisect easy and informative:
Example Output:
$ auto-bisect --good v1.0.0 --bad HEAD --test "npm test"
=== Auto Bisect ===
Good: v1.0.0 (abc123)
Bad: HEAD (xyz789)
Calculating search space...
Commits between good and bad: 127
Expected bisect steps: ~7 (log₂(127) = 6.99)
Starting automated bisect...
Step 1/7: Testing commit def456 "Add user profile feature"
Running: npm test
Result: GOOD (tests pass)
Search space: 127 → 63 commits remaining
Step 2/7: Testing commit ghi789 "Refactor authentication"
Running: npm test
Result: BAD (tests fail)
Search space: 63 → 31 commits remaining
Step 3/7: Testing commit jkl012 "Update dependencies"
Running: npm test
Result: SKIP (build failed, can't test)
Skipping this commit, trying adjacent...
Testing commit jkl011 "Fix linting errors"
Result: GOOD (tests pass)
Search space: 31 → 15 commits remaining
... (steps 4-7)
Step 7/7: Testing commit mno345 "Fix login redirect"
Running: npm test
Result: BAD (tests fail)
=== BISECT COMPLETE ===
First bad commit: mno345
Author: Alice <alice@example.com>
Date: 2024-01-12 14:32:00
Fix login redirect
Changed the redirect URL after successful login
to use relative paths instead of absolute.
Changed files:
src/auth/login.ts (+5 -3)
src/routes/index.ts (+2 -1)
This commit likely introduced the bug!
Suggestion: Check the changes to src/auth/login.ts lines 45-52
$ auto-bisect --log
Previous bisect sessions:
2024-01-15: Found mno345 (7 steps, 2m 30s)
2024-01-10: Found abc123 (5 steps, 1m 45s)
The Core Question You’re Answering
“How does binary search apply to debugging, and how does Git leverage the commit graph for bisect?”
Before you write any code, sit with this question. Bisect works because Git history is ordered (parent relationships). Given a known-good and known-bad commit, you can binary search through the DAG to find where things went wrong.
Concepts You Must Understand First
Stop and research these before coding:
- Binary Search on DAGs
- How does bisect work when history isn’t linear?
- How does Git choose the midpoint in a merge-heavy history?
- What’s the worst-case complexity?
- Book Reference: “Pro Git” Ch. 7.10 — Chacon
- Git Bisect State
- Where does Git store bisect state?
- What are the bisect commands (start, good, bad, skip, reset)?
- How do you automate bisect with git bisect run?
- Book Reference: “Pro Git” Ch. 7.10 — Chacon
- Test Reliability
- What makes a test suitable for bisecting?
- How do you handle commits that can’t be tested (build failures)?
- How do you detect and handle flaky tests?
- Book Reference: “Continuous Delivery” Ch. 8 — Humble & Farley
Questions to Guide Your Design
Before implementing, think through these:
- Test Script Interface
- What exit codes should the test script use (0=good, 1-124=bad, 125=skip)?
- How do you handle timeouts?
- How do you capture and display test output?
- Bisect Optimization
- How can you speed up bisect (parallel builds, caching)?
- How do you minimize checkout operations?
- Can you pre-compute which commits are skippable?
- Reporting
- What information is most useful when bisect completes?
- How do you present the journey (steps taken)?
- How do you suggest next debugging steps?
Thinking Exercise
Walk Through Bisect Manually
Simulate bisect on paper:
Commit history: A ← B ← C ← D ← E ← F ← G ← H
(A is known good, H is known bad; the first bad commit is hidden somewhere in between)
Start: good=A, bad=H (8 commits)
Questions while walking through:
- Step 1: Which commit does Git test first (midpoint)?
- If midpoint is BAD, what’s the new search range?
- If midpoint is GOOD, what’s the new search range?
- How many steps maximum to find the first bad commit?
- What if commit D can’t be built (skip)?
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain how git bisect works internally.”
- “What’s the time complexity of git bisect?”
- “How would you handle a situation where bisect identifies a merge commit as bad?”
- “What makes a good test script for automated bisect?”
- “How would you bisect a performance regression (not a pass/fail test)?”
Hints in Layers
Hint 1: Starting Point
Use git bisect run ./test-script.sh. Exit code 0 = good, 1-124 = bad, 125 = skip, 126-127 = abort.
Hint 2: Parsing Output
Capture bisect output to track progress. Look for “Bisecting:” lines to know which commit is being tested.
Hint 3: Enhanced Reporting
After bisect completes, use git show --stat <bad-commit> to show what files changed.
Hint 4: Flaky Detection
Run the test multiple times at a commit. If results are inconsistent, mark as flaky and skip.
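A sketch of that retry logic, layered on the exit-code convention from Hint 1; three runs is an arbitrary choice, and the abort codes are folded into skip for simplicity:

```python
import subprocess

def classify_commit(test_cmd, runs=3):
    """Return 'good', 'bad', or 'skip' for the currently checked-out commit."""
    results = []
    for _ in range(runs):
        code = subprocess.run(test_cmd, shell=True).returncode
        if code >= 125:
            return "skip"      # untestable (125); abort codes simplified to skip here
        results.append(code == 0)
    if all(results):
        return "good"
    if not any(results):
        return "bad"
    return "skip"              # mixed results: flaky, don't let it steer the bisect
```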
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git bisect | “Pro Git” by Chacon | Ch. 7.10 |
| Binary search | “Algorithms” by Sedgewick | Ch. 1 |
| Test reliability | “Continuous Delivery” by Humble & Farley | Ch. 8 |
Implementation Hints
Basic automated bisect wrapper:
import subprocess

def auto_bisect(good, bad, test_cmd):
    # Initialize the bisect session
    subprocess.run(["git", "bisect", "start"], check=True)
    subprocess.run(["git", "bisect", "bad", bad], check=True)
    subprocess.run(["git", "bisect", "good", good], check=True)
    while True:
        # Report which commit bisect checked out for testing
        current = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True
        ).stdout.strip()
        print(f"Testing {current}...")
        # Run the test; exit codes follow the git bisect run convention
        code = subprocess.run(test_cmd, shell=True).returncode
        if code == 0:
            mark = "good"
        elif code < 125:
            mark = "bad"
        elif code == 125:
            mark = "skip"
        else:
            break  # 126-127: abort the session
        # Tell bisect the result; its output says when we're done
        result = subprocess.run(
            ["git", "bisect", mark],
            capture_output=True, text=True
        )
        if "is the first bad commit" in result.stdout:
            print(result.stdout)
            break
    # Cleanup: return to the original HEAD
    subprocess.run(["git", "bisect", "reset"], check=True)
Learning milestones:
- You can run basic git bisect manually → You understand bisect concepts
- You can automate bisect with a test script → You understand git bisect run
- You can handle skip cases gracefully → You understand real-world complexity
- You can generate useful reports → You have a debugging power tool
Project 10: Stacked PRs Manager
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Rust, TypeScript
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Git Rebase / GitHub API / Workflow Automation
- Software or Tool: Git, GitHub
- Main Book: Pro Git by Scott Chacon
What you’ll build: A tool for managing stacked (dependent) pull requests—where PR 2 depends on PR 1, PR 3 depends on PR 2, etc. The tool handles rebasing the stack when upstream changes, updating PR descriptions with dependency info, and orchestrating merges in order.
Why it teaches Git workflows: Stacked PRs are essential for large features that need to be reviewed incrementally. Managing them manually is error-prone; this project teaches you the rebase mechanics and the orchestration needed to keep a chain of branches synchronized.
Core challenges you’ll face:
- Tracking stack relationships → maps to understanding branch dependencies
- Cascading rebases → maps to understanding how rebasing affects dependents
- Updating PR descriptions → maps to understanding GitHub API and automation
- Orchestrating merges → maps to understanding ordered merge operations
Key Concepts:
- Rebase mechanics: Pro Git Ch. 3.6, 7.6 — Scott Chacon
- Dependent branches: Graphite/Stacked PRs concepts
- GitHub API: GitHub REST API documentation
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Projects 1-7 completed, strong rebase understanding
Real World Outcome
You’ll have a CLI for managing PR stacks:
Example Output:
$ stack create auth-refactor
Created stack 'auth-refactor' based on main
$ stack branch add-user-model
Created branch 'auth-refactor/add-user-model' in stack
# ... make commits ...
$ stack branch add-login-endpoint
Created branch 'auth-refactor/add-login-endpoint' on top of 'add-user-model'
# ... make commits ...
$ stack branch add-login-ui
Created branch 'auth-refactor/add-login-ui' on top of 'add-login-endpoint'
# ... make commits ...
$ stack status
Stack: auth-refactor (based on main)
main
└── add-user-model (3 commits, PR #127)
└── add-login-endpoint (5 commits, PR #128)
└── add-login-ui (4 commits, PR #129) ← HEAD
All branches up to date with their bases.
$ stack push
Pushing stack to origin...
✓ Pushed auth-refactor/add-user-model
✓ Pushed auth-refactor/add-login-endpoint
✓ Pushed auth-refactor/add-login-ui
Creating/updating PRs...
✓ PR #127: Add user model
Base: main
Description updated with stack info
✓ PR #128: Add login endpoint
Base: auth-refactor/add-user-model
Description updated with stack info:
"⬆️ Depends on: #127 (Add user model)
⬇️ Blocks: #129 (Add login UI)"
✓ PR #129: Add login UI
Base: auth-refactor/add-login-endpoint
Description updated with stack info:
"⬆️ Depends on: #128 (Add login endpoint)"
$ # Someone merges PR #127...
$ stack sync
Syncing stack with origin...
Fetching updates...
PR #127 was merged to main!
Rebasing stack...
Rebasing add-login-endpoint onto main...
✓ Rebased successfully (was based on add-user-model)
Rebasing add-login-ui onto add-login-endpoint...
✓ Rebased successfully
Updating PRs...
✓ PR #128: Base changed to main (was add-user-model)
✓ PR #129: No changes needed
Stack synced! Ready for more merges.
The Core Question You’re Answering
“How do you break large features into reviewable chunks while keeping them synchronized?”
Before you write any code, sit with this question. Large PRs are hard to review, but splitting a feature into dependent PRs creates a maintenance burden. The solution is tooling that automates the cascade.
Concepts You Must Understand First
Stop and research these before coding:
- Branch Dependencies
- How do you model “branch B is based on branch A”?
- What happens to B when A gets new commits?
- What happens when A is rebased or merged?
- Book Reference: “Pro Git” Ch. 3.6 — Chacon
- Cascading Rebase
- How do you rebase a chain of branches?
- What order should you rebase in?
- How do you handle conflicts in the middle of a chain?
- Book Reference: “Pro Git” Ch. 7.6 — Chacon
- PR Base Branches
- How do you set a PR’s base branch to another branch (not main)?
- What happens to a PR when its base branch is merged?
- How do you update a PR’s base branch via API?
- Resource: GitHub REST API documentation
Questions to Guide Your Design
Before implementing, think through these:
- Stack Representation
- How will you store the stack metadata (which branch is on top of which)?
- Should this be in .git, a config file, or derived from branch names?
- How do you handle branches that are part of multiple stacks?
- Sync Algorithm
- When main changes, how do you detect which stacks need updating?
- When a PR is merged, how do you update the stack?
- What if a rebase has conflicts?
- PR Management
- How do you set up a PR with a non-main base branch?
- How do you generate the “depends on/blocks” description?
- How do you keep PR descriptions in sync with stack state?
Thinking Exercise
Trace a Stack Update
Simulate a stack update on paper:
Initial state:
main ← A ← B (add-user-model)
└← C ← D (add-login-endpoint)
└← E ← F (add-login-ui)
Action: Main gets new commit X
main ← X
Desired state:
main ← X ← A' ← B' (add-user-model)
└← C' ← D' (add-login-endpoint)
└← E' ← F' (add-login-ui)
Questions while tracing:
- In what order do you rebase the branches?
- What commands do you run for each rebase?
- Why are A’, B’, C’, etc. different commits (different SHAs)?
- What if commit C conflicts with commit X?
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you design a system for managing dependent pull requests?”
- “What happens when a branch in the middle of a stack gets merged?”
- “How do you handle rebase conflicts in a stack of branches?”
- “What are the tradeoffs between stacked PRs and a single large PR?”
- “How would you implement ‘stacking’ without special tooling?”
Hints in Layers
Hint 1: Starting Point
Store stack metadata in .git/stack-info/. For each stack, record the ordered list of branches.
Hint 2: Rebase Order
Always rebase from the bottom of the stack to the top. If you rebase the top first, you’ll have to redo it when you rebase its base.
Hint 3: Detecting Merges
Check if a branch’s base exists on the remote. If the base is merged into main, main becomes the new base.
Hint 4: PR Updates
Use gh api or PyGithub to update PR base branches. When base is merged, GitHub automatically retargets to main.
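A sketch of that base-branch update against the REST endpoint PATCH /repos/{owner}/{repo}/pulls/{pull_number}; the requests library and the example arguments are illustrative:

```python
import requests

def retarget_pr(token, owner, repo, pr_number, new_base):
    """Point an open PR at a new base branch."""
    resp = requests.patch(
        f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/vnd.github+json"},
        json={"base": new_base},  # the only field we need to change
    )
    resp.raise_for_status()
    return resp.json()

# e.g. after PR #127 merges: retarget_pr(token, "acme", "shop", 128, "main")
```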
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Rebase mechanics | “Pro Git” by Chacon | Ch. 3.6, 7.6 |
| Branch workflows | “Pro Git” by Chacon | Ch. 5.1-5.3 |
| GitHub CLI | gh documentation | — |
Implementation Hints
Stack metadata format:
# .git/stack-info/auth-refactor.yaml
name: auth-refactor
base: main
branches:
- name: add-user-model
pr: 127
- name: add-login-endpoint
pr: 128
- name: add-login-ui
pr: 129
Cascade rebase algorithm:
import subprocess

def rev_parse(ref):
    return subprocess.run(["git", "rev-parse", ref],
                          capture_output=True, text=True, check=True).stdout.strip()

def sync_stack(stack):
    base = stack.base  # "main"
    prev_old_tip = None
    for branch in stack.branches:
        old_tip = rev_parse(branch.name)  # remember before the rebase moves it
        if prev_old_tip is None:
            # Bottom of the stack: replay its commits directly onto the base
            subprocess.run(["git", "rebase", base, branch.name], check=True)
        else:
            # Replay only this branch's own commits (those after the old
            # parent tip) onto the freshly rebased parent
            subprocess.run(["git", "rebase", "--onto", base,
                            prev_old_tip, branch.name], check=True)
        prev_old_tip = old_tip
        # This branch becomes the base for the next one
        base = branch.name
Learning milestones:
- You can track stack relationships → You understand branch dependencies
- You can cascade rebases correctly → You understand rebase mechanics deeply
- You can sync PRs with stack state → You understand GitHub API integration
- Your tool handles the merge case → You have a production-ready stacked PR tool
Project 11: Conventional Commits Enforcer
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, Python, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Parsing / Semantic Versioning / Git Hooks
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A commit message linter that enforces the Conventional Commits specification, generates changelogs automatically, and determines semantic version bumps based on commit types.
Why it teaches Git workflows: Conventional commits are the foundation of automated releases. By building the parser and enforcer yourself, you’ll understand how tools like semantic-release, commitlint, and changelog generators work—and why consistent commit messages enable powerful automation.
Core challenges you’ll face:
- Parsing commit message format → maps to understanding the Conventional Commits spec
- Determining version bumps → maps to understanding semantic versioning
- Generating changelogs → maps to understanding release automation
- Integrating with hooks → maps to understanding enforcement mechanisms
Key Concepts:
- Conventional Commits: conventionalcommits.org specification
- Semantic Versioning: semver.org
- Changelog generation: keepachangelog.com
Difficulty: Intermediate Time estimate: 1 week Prerequisites: Understanding of regex, basic parsing concepts
Real World Outcome
You’ll have a complete commit linting and changelog system:
Example Output:
$ commit-lint "Add new feature"
❌ Invalid commit message
Error: Missing type prefix
Expected format: <type>[optional scope]: <description>
Valid types: feat, fix, docs, style, refactor, test, chore
Example: "feat: add new feature"
$ commit-lint "feat: add user authentication"
✓ Valid commit message
Type: feat (minor version bump)
Scope: none
Description: add user authentication
Breaking: no
$ commit-lint "fix(auth)!: handle token expiration correctly"
✓ Valid commit message
Type: fix (patch version bump)
Scope: auth
Description: handle token expiration correctly
Breaking: YES (major version bump)
$ changelog generate --from v1.2.0 --to HEAD
# Changelog
## [1.3.0] - 2024-01-15
### Features
- **auth:** add two-factor authentication (#127)
- add password strength indicator (#125)
### Bug Fixes
- **api:** fix rate limiting calculation (#130)
- handle edge case in date parsing (#128)
### Documentation
- update API reference for auth endpoints (#131)
### BREAKING CHANGES
- **auth:** token format changed, clients must update (#126)
---
$ semver suggest --from v1.2.0
Analyzing commits since v1.2.0...
Commits analyzed: 12
- feat: 4 (minor bump)
- fix: 5 (patch bump)
- docs: 2 (no bump)
- feat!: 1 (BREAKING - major bump)
Suggested next version: 2.0.0 (major bump due to breaking change)
Breaking commit:
abc123 feat(auth)!: token format changed, clients must update
The Core Question You’re Answering
“How do you turn commit messages into automated releases and changelogs?”
Before you write any code, sit with this question. The answer is conventions. When every commit follows a pattern, machines can parse them to determine what changed, how to version it, and what to tell users.
Concepts You Must Understand First
Stop and research these before coding:
- Conventional Commits Specification
- What are the required elements (type, description)?
- What are the optional elements (scope, body, footer)?
- How do you indicate breaking changes?
- Resource: conventionalcommits.org
- Semantic Versioning
- What do MAJOR.MINOR.PATCH mean?
- When do you bump each number?
- What’s a pre-release version?
- Resource: semver.org
- Changelog Best Practices
- What sections should a changelog have?
- How do you group changes by type?
- What makes a changelog human-readable?
- Resource: keepachangelog.com
Questions to Guide Your Design
Before implementing, think through these:
- Parsing
- How do you handle multi-line commit messages?
- How do you extract the body vs. footers?
- What regex pattern matches the Conventional Commits format?
- Version Determination
- What if there are multiple breaking changes?
- How do you handle version ranges (e.g., v1.0.0 to v2.0.0)?
- What about pre-release versions?
- Integration
- How do you make this work as a commit-msg hook?
- How do you handle commits that bypass hooks?
- Should CI also validate commit messages?
Thinking Exercise
Parse Example Commits
Analyze these commit messages:
1. feat(auth): add OAuth2 support
2. fix: resolve memory leak in worker pool
3. feat!: redesign API response format
4. docs(readme): update installation instructions
5. chore(deps): bump lodash from 4.17.20 to 4.17.21
6. refactor(core): extract validation logic
This change moves validation into a separate module
for better testability.
BREAKING CHANGE: ValidationError now includes error codes
Reviewed-by: Alice
Refs: #123
Questions while parsing:
- What’s the type, scope, and description for each?
- Which commits bump which version component?
- How do you extract the body from commit 6?
- How do you detect the BREAKING CHANGE footer?
The Interview Questions They’ll Ask
Prepare to answer these:
- “Why are conventional commits useful for automated releases?”
- “How would you parse a commit message to extract the type and scope?”
- “Explain semantic versioning and when you’d bump each number.”
- “How would you handle enforcing commit conventions in a large team?”
- “What’s the difference between a breaking change in the type (!) and in the footer?”
Hints in Layers
Hint 1: Starting Point
The basic regex: ^(\w+)(\(.+\))?!?: (.+)$. This captures type, optional scope, and description.
Hint 2: Body and Footers
Split on double newlines. First paragraph is the subject. Remaining paragraphs are body unless they match key: value or BREAKING CHANGE:.
Hint 3: Version Bump Logic
BREAKING CHANGE or ! → major. feat → minor. fix → patch. Everything else → no bump.
Hint 4: Changelog Grouping
Use a map: {feat: [], fix: [], docs: []}. Iterate commits, append to appropriate bucket, then render.
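A sketch pulling Hints 1, 3, and 4 together in Python, complementing the Rust subject parser under Implementation Hints; the type-to-bump mapping follows the hint above:

```python
import re
from collections import defaultdict

SUBJECT = re.compile(r"^(\w+)(\([^)]+\))?(!)?: (.+)$")
ORDER = {"none": 0, "patch": 1, "minor": 2, "major": 3}

def analyze(messages):
    """Group commit subjects by type and suggest a semver bump."""
    groups, bump = defaultdict(list), "none"
    for msg in messages:
        m = SUBJECT.match(msg.splitlines()[0])
        if not m:
            continue  # a real linter would reject this commit instead
        type_, _scope, bang, desc = m.groups()
        groups[type_].append(desc)
        breaking = bool(bang) or "BREAKING CHANGE:" in msg
        level = "major" if breaking else {"feat": "minor", "fix": "patch"}.get(type_, "none")
        if ORDER[level] > ORDER[bump]:
            bump = level
    return bump, dict(groups)

bump, groups = analyze([
    "feat(auth): add OAuth2 support",
    "fix: resolve memory leak in worker pool",
    "feat!: redesign API response format",
])
print(bump)  # major
```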
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git hooks | “Pro Git” by Chacon | Ch. 8.3 |
| Regex parsing | “Mastering Regular Expressions” by Friedl | Ch. 2-3 |
| Release automation | “Continuous Delivery” by Humble & Farley | Ch. 5 |
Implementation Hints
Conventional Commits parser:
use regex::Regex;
use std::error::Error;

struct ConventionalCommit {
    type_: String,
    scope: Option<String>,
    description: String,
    body: Option<String>,
    breaking: bool,
    footers: Vec<(String, String)>,
}

fn parse(message: &str) -> Result<ConventionalCommit, Box<dyn Error>> {
    // Subject is the first line; reject empty messages
    let subject = message.lines().next().ok_or("Empty message")?;
    // type, optional (scope), optional !, then description
    let re = Regex::new(r"^(\w+)(\(.+\))?(!)?:\s*(.+)$")?;
    let caps = re.captures(subject).ok_or("Invalid format")?;
    let type_ = caps[1].to_string();
    let scope = caps.get(2).map(|m| m.as_str().trim_matches(&['(', ')'][..]).to_string());
    let breaking = caps.get(3).is_some();
    let description = caps[4].to_string();
    // Parse body and footers from the remaining paragraphs...
    Ok(ConventionalCommit {
        type_, scope, description,
        body: None, breaking, footers: Vec::new(),
    })
}
Learning milestones:
- You can parse conventional commit format → You understand the specification
- You can determine version bumps → You understand semver
- You can generate changelogs → You understand release automation
- You can enforce via hooks → You have a complete system
Project 12: Git Worktree Manager
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, Bash
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Git Worktrees / Filesystem / Productivity
- Software or Tool: Git
- Main Book: Pro Git by Scott Chacon
What you’ll build: A TUI (text UI) tool for managing Git worktrees—allowing you to have multiple branches checked out simultaneously in separate directories, with easy creation, switching, and cleanup.
Why it teaches Git workflows: Worktrees are Git’s hidden superpower for working on multiple branches simultaneously. By building a manager, you’ll understand how worktrees relate to the .git directory, when to use them vs. stash, and how they enable parallel development workflows.
Core challenges you’ll face:
- Understanding worktree mechanics → maps to understanding .git and working directory separation
- Managing multiple checkouts → maps to understanding branch locking
- Building a usable TUI → maps to understanding developer tooling UX
- Cleaning up orphaned worktrees → maps to understanding Git garbage collection
Key Concepts:
- Git worktrees: Pro Git Ch. 7.11 — Scott Chacon
- TUI development: bubble tea (Go), ratatui (Rust), or similar
- Branch references: Git refs and HEAD management
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Understanding of Git branches, basic TUI or CLI experience
Real World Outcome
You’ll have a TUI for managing worktrees:
Example Output:
$ wt
┌─ Git Worktree Manager ─────────────────────────────────┐
│ │
│ Repository: my-project │
│ Main worktree: /Users/dev/my-project │
│ │
│ Worktrees: │
│ ┌────────────────────────────────────────────────────┐│
│ │ ● main /Users/dev/my-project [M] ││
│ │ feature/auth /Users/dev/my-project-auth ││
│ │ hotfix/bug /Users/dev/my-project-hotfix ││
│ │ experiment /Users/dev/my-project-exp [*] ││
│ └────────────────────────────────────────────────────┘│
│ │
│ [M] = main worktree [*] = current │
│ │
│ Commands: │
│ [n] New worktree [d] Delete [o] Open in editor │
│ [c] Clean up [r] Refresh [q] Quit │
│ │
└────────────────────────────────────────────────────────┘
> n (new worktree)
Branch name (or new branch): feature/new-api
Directory (default: ../my-project-new-api):
Creating worktree...
✓ Created worktree at /Users/dev/my-project-new-api
✓ Checked out branch feature/new-api
Open in editor? [y/N]: y
✓ Opened in VS Code
$ wt status
3 worktrees active:
main → /Users/dev/my-project (main worktree)
feature/auth → /Users/dev/my-project-auth (2 days old, 5 commits)
feature/new → /Users/dev/my-project-new-api (just created)
$ wt clean
Scanning for orphaned worktrees...
Found 1 orphaned worktree:
/Users/dev/my-project-old-feature (branch deleted)
Remove orphaned worktrees? [y/N]: y
✓ Removed 1 orphaned worktree
The Core Question You’re Answering
“How do you work on multiple branches simultaneously without constant switching?”
Before you write any code, sit with this question. Worktrees let you have multiple working directories sharing one .git database. Each worktree is an independent checkout—you can build one branch while testing another.
Concepts You Must Understand First
Stop and research these before coding:
- Worktree Mechanics
- How does a worktree relate to the main .git directory?
- Why can’t two worktrees have the same branch checked out?
- Where does Git store worktree metadata?
- Book Reference: “Pro Git” Ch. 7.11 — Chacon
- Branch Locking
- What error do you get when trying to checkout a branch that’s in another worktree?
- How do you find which worktree has a branch?
- What happens when you delete a branch that’s checked out in a worktree?
- Resource: git worktree --help
- TUI Development
- How do you handle keyboard input in a terminal?
- How do you draw boxes and update the screen?
- What libraries exist for your language?
- Resource: Bubble Tea (Go), ratatui (Rust), or curses (Python)
Questions to Guide Your Design
Before implementing, think through these:
- Worktree Creation
- Should you create a new branch or checkout an existing one?
- What’s a sensible naming convention for worktree directories?
- How do you handle creation in a specific path vs. automatic path?
- Navigation
- How do you switch between worktrees (cd, open editor, etc.)?
- How do you show which worktree is “current”?
- How do you handle worktrees on remote branches?
- Cleanup
- How do you detect orphaned worktrees (deleted branch, missing directory)?
- Should cleanup be automatic or manual?
- How do you handle worktrees with uncommitted changes?
Thinking Exercise
Explore Worktrees Manually
Set up worktrees and explore:
git init worktree-test && cd worktree-test
echo "main" > file.txt && git add . && git commit -m "initial"
git branch feature-a
git branch feature-b
# Create worktrees
git worktree add ../test-feature-a feature-a
git worktree add ../test-feature-b feature-b
# Explore
git worktree list
ls -la .git/worktrees/
cat .git/worktrees/test-feature-a/HEAD
Questions while exploring:
- What’s in .git/worktrees/?
- What happens if you try git checkout feature-a in the main worktree?
- If you delete ../test-feature-a/, what does git worktree list show?
- What does git worktree prune do?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What are Git worktrees and when would you use them?”
- “How do worktrees differ from just cloning the repository again?”
- “What happens to a worktree when its branch is deleted?”
- “How would you use worktrees in a CI/CD pipeline?”
- “What’s the relationship between the main .git directory and worktree checkouts?”
Hints in Layers
Hint 1: Starting Point
git worktree list --porcelain gives machine-parseable output. Start by parsing this.
Hint 2: Worktree Info
The porcelain output has worktree, HEAD, branch lines for each worktree. Parse into structs.
Hint 3: Creation
git worktree add <path> <branch> creates a worktree. Add -b <branch> to create a new branch.
Hint 4: TUI Framework
Use a framework rather than raw terminal codes. In Go, Bubble Tea is excellent. In Rust, ratatui.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git worktrees | “Pro Git” by Chacon | Ch. 7.11 |
| TUI development | Bubble Tea documentation | Getting started |
| CLI design | “The Linux Command Line” by Shotts | Ch. 30 |
Implementation Hints
Worktree list parsing:
import (
    "os/exec"
    "strings"
)

type Worktree struct {
    Path   string
    Head   string
    Branch string
}

func listWorktrees() ([]Worktree, error) {
    out, err := exec.Command("git", "worktree", "list", "--porcelain").Output()
    if err != nil {
        return nil, err
    }
    var worktrees []Worktree
    var current Worktree
    for _, line := range strings.Split(string(out), "\n") {
        switch {
        case strings.HasPrefix(line, "worktree "):
            current.Path = strings.TrimPrefix(line, "worktree ")
        case strings.HasPrefix(line, "HEAD "):
            current.Head = strings.TrimPrefix(line, "HEAD ")
        case strings.HasPrefix(line, "branch "):
            current.Branch = strings.TrimPrefix(line, "branch refs/heads/")
        case line == "" && current.Path != "":
            // A blank line terminates one worktree record
            worktrees = append(worktrees, current)
            current = Worktree{}
        }
    }
    if current.Path != "" { // flush the last record if output lacked a trailing blank line
        worktrees = append(worktrees, current)
    }
    return worktrees, nil
}
Learning milestones:
- You can list and parse worktrees → You understand worktree metadata
- You can create and delete worktrees → You understand worktree lifecycle
- You can detect orphans and clean up → You understand worktree maintenance
- Your TUI is pleasant to use → You have a productivity tool
Project 13: Repository Analytics Dashboard
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Python
- Alternative Programming Languages: TypeScript, Go, Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Data Analysis / Git Log Parsing / Visualization
- Software or Tool: Git, Matplotlib/D3.js
- Main Book: Software Engineering at Google by Winters et al.
What you’ll build: A dashboard that analyzes repository history to show contribution patterns, code hotspots, team dynamics, and technical debt indicators—like a mini version of GitPrime or Pluralsight Flow.
Why it teaches Git workflows: Understanding how a team uses Git reveals workflow health. By mining git log data, you’ll see how commit frequency, merge patterns, and contributor activity reflect team practices—and how to improve them.
Core challenges you’ll face:
- Parsing git log efficiently → maps to understanding Git’s output formats
- Calculating metrics → maps to understanding software engineering metrics
- Visualizing trends → maps to understanding data presentation
- Detecting patterns → maps to understanding code evolution
Key Concepts:
- Git log formats: Pro Git Ch. 2.3 — Chacon
- Software metrics: Software Engineering at Google Ch. 7 — Winters et al.
- Data visualization: Matplotlib/D3.js documentation
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Data analysis basics, understanding of Git history
Real World Outcome
You’ll have a dashboard that reveals repository insights:
Example Output:
$ repo-analytics /path/to/repo --period "last 6 months"
=== Repository Analytics Dashboard ===
📊 OVERVIEW
─────────────────────────────────────────
Repository: awesome-project
Period: Jul 2024 - Jan 2025 (6 months)
Commits: 847
Contributors: 12
Lines changed: +45,231 / -12,847
📈 COMMIT ACTIVITY
─────────────────────────────────────────
Monthly commits:
Jul ████████████████████ 156
Aug ████████████████ 132
Sep ███████████████████ 148
Oct ██████████████████████ 178
Nov ███████████████ 121
Dec ██████████ 82 (holiday season)
Jan ████████ 30 (partial month)
Peak day: Tuesdays (avg 8.2 commits/day)
Quietest: Weekends (avg 0.8 commits/day)
👥 TOP CONTRIBUTORS
─────────────────────────────────────────
alice ████████████████ 287 commits (34%)
bob ██████████ 178 commits (21%)
charlie ████████ 142 commits (17%)
diana ██████ 98 commits (12%)
others ████ 142 commits (16%)
🔥 CODE HOTSPOTS (most frequently changed)
─────────────────────────────────────────
src/api/handlers.ts Modified 47 times by 6 authors
src/core/parser.ts Modified 38 times by 4 authors
src/utils/validation.ts Modified 35 times by 8 authors
⚠️ High churn files may indicate:
- Complex logic needing simplification
- Missing tests causing bugs
- Feature instability
📉 TECHNICAL DEBT INDICATORS
─────────────────────────────────────────
TODO/FIXME comments added: 23
TODO/FIXME comments removed: 8
Net debt: +15 (growing)
Large commits (>500 lines): 12
"WIP" or "fix" commits: 34
Merge conflicts resolved: 28
Reverted commits: 4
🔀 MERGE PATTERNS
─────────────────────────────────────────
Merge commits: 89
Squash merges: 156
Rebase merges: 42
Average PR size: 127 lines
Average review time: 1.8 days
PRs merged without review: 12 (7%)
📁 FILE TYPE DISTRIBUTION
─────────────────────────────────────────
TypeScript: 67% (24,521 LOC)
JSON: 12% (4,891 LOC)
Markdown: 8% (3,211 LOC)
YAML: 5% (1,678 LOC)
Other: 8%
The Core Question You’re Answering
“What does Git history reveal about a team’s development practices and code health?”
Before you write any code, sit with this question. Every commit tells a story—patterns in commits, authors, file changes, and timing reveal how a team works, where problems lurk, and what might need attention.
Concepts You Must Understand First
Stop and research these before coding:
- Git Log Formats
- What fields can you extract from git log?
- How do you use --format for custom output?
- How do you efficiently iterate through large histories?
- Book Reference: “Pro Git” Ch. 2.3 — Chacon
- Software Engineering Metrics
- What’s code churn and why does it matter?
- What’s bus factor and how do you calculate it?
- What metrics indicate healthy vs. unhealthy repos?
- Book Reference: “Software Engineering at Google” Ch. 7 — Winters et al.
- Data Visualization
- How do you choose the right chart type?
- How do you present trends over time?
- How do you make terminal-based visualizations?
- Resource: Matplotlib / D3.js documentation
Questions to Guide Your Design
Before implementing, think through these:
- Data Collection
- What git commands give you the data you need?
- How do you handle repositories with millions of commits?
- How do you normalize data across different time periods?
- Metric Calculation
- What metrics are genuinely useful vs. vanity metrics?
- How do you account for different commit styles (small vs. large)?
- How do you identify outliers (bot commits, merges, etc.)?
- Presentation
- Should output be terminal, HTML, or JSON?
- How do you make insights actionable?
- What warnings or recommendations should you provide?
Thinking Exercise
Analyze a Real Repository
Pick an open source repository and analyze manually:
# Clone a popular project
git clone https://github.com/microsoft/vscode --depth 1000
# Analyze
git log --format="%H|%an|%ae|%at|%s" --numstat | head -100
git shortlog -sn | head -10
git log --since="6 months ago" --oneline | wc -l
Questions while analyzing:
- Who are the top contributors?
- What files change most frequently?
- What patterns do you see in commit messages?
- What would you want to know about this project’s health?
The Interview Questions They’ll Ask
Prepare to answer these:
- “What metrics would you track to measure developer productivity?”
- “How would you identify technical debt from Git history?”
- “What does high code churn indicate, and is it always bad?”
- “How would you calculate the ‘bus factor’ for a repository?”
- “What insights can you derive from merge patterns?”
Hints in Layers
Hint 1: Starting Point
Use git log --format="%H|%an|%at|%s" --numstat to get commits with file stats. Parse the output into structured data.
Hint 2: Performance
For large repos, use --since and --until to limit scope. Process in streams, don’t load everything into memory.
Hint 3: Hotspot Detection
Count how often each file appears in commits. Files with high counts AND multiple authors are likely complex.
Hint 4: Terminal Charts
Use Unicode block characters (▁▂▃▄▅▆▇█) for simple bar charts. Libraries like asciichart can help.
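A sketch combining Hints 3 and 4: count file appearances with --name-only and render simple bars; the repo path, time window, and bar width are placeholders:

```python
import subprocess
from collections import Counter

def hotspots(repo, since="6 months ago", top=5):
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", "--format=", "--name-only"],
        capture_output=True, text=True, check=True).stdout
    # --format= suppresses commit lines, leaving one file path per line
    counts = Counter(line for line in out.splitlines() if line)
    peak = max(counts.values(), default=1)
    for path, n in counts.most_common(top):
        bar = "█" * max(1, round(n / peak * 20))  # scale to a 20-cell bar
        print(f"{path:40} {bar} {n}")

# hotspots("/path/to/repo")
```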
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git log | “Pro Git” by Chacon | Ch. 2.3 |
| Software metrics | “Software Engineering at Google” by Winters et al. | Ch. 7 |
| Data visualization | “The Visual Display of Quantitative Information” by Tufte | Ch. 1-3 |
Implementation Hints
Git log parsing:
import subprocess

def parse_commits(repo_path, since=None):
    # One record per line: hash|author|unix-timestamp|subject
    cmd = ["git", "-C", repo_path, "log",
           "--format=%H|%an|%at|%s"]
    if since:
        cmd.append(f"--since={since}")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    commits = []
    for line in result.stdout.strip().split("\n"):
        if "|" in line:
            # maxsplit=3 keeps '|' characters inside the subject intact
            hash_, author, timestamp, subject = line.split("|", 3)
            commits.append({
                "hash": hash_,
                "author": author,
                "timestamp": int(timestamp),
                "subject": subject
            })
    return commits
Learning milestones:
- You can parse git log efficiently → You understand Git’s output formats
- You can calculate meaningful metrics → You understand software engineering metrics
- You can visualize trends → You understand data presentation
- Your insights are actionable → You have a useful analytics tool
Project 14: Git Secret Scanner
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, Python
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Security / Pattern Matching / Git History
- Software or Tool: Git
- Main Book: Practical Binary Analysis by Andriesse
What you’ll build: A tool that scans Git history (not just current files) for accidentally committed secrets—API keys, passwords, tokens—and can optionally help remove them from history.
Why it teaches Git workflows: Security is a critical part of Git workflows. By building a scanner, you’ll understand how secrets persist in history even after deletion, how tools like git-secrets and truffleHog work, and how to use git filter-branch or BFG to rewrite history.
Core challenges you’ll face:
- Pattern matching for secrets → maps to understanding secret patterns
- Scanning all of history efficiently → maps to understanding Git object traversal
- History rewriting → maps to understanding filter-branch and BFG
- Minimizing false positives → maps to understanding entropy analysis
Key Concepts:
- Secret patterns: Regular expressions for common secret formats
- Entropy analysis: High-entropy strings are likely secrets
- History rewriting: Pro Git Ch. 7.6 — Chacon (filter-branch)
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Regex, understanding of security basics, Project 1 completed
Real World Outcome
You’ll have a security tool for Git repositories:
Example Output:
$ secret-scan /path/to/repo
=== Git Secret Scanner ===
Scanning repository: my-project
Mode: Full history scan
Scanning 1,247 commits...
[████████████████████████████████████████] 100%
⚠️ SECRETS FOUND: 7
HIGH SEVERITY:
─────────────────────────────────────────
1. AWS Access Key
Commit: abc1234 (2023-06-15)
Author: alice@example.com
File: config/prod.env (line 12)
Pattern: AKIAIOSFODNN7EXAMPLE
Status: ❌ Still in current HEAD
2. GitHub Personal Access Token
Commit: def5678 (2023-08-22)
Author: bob@example.com
File: scripts/deploy.sh (line 45)
Pattern: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Status: ✓ Deleted in later commit (but still in history!)
MEDIUM SEVERITY:
─────────────────────────────────────────
3. Private RSA Key
Commit: ghi9012 (2023-09-01)
File: .ssh/id_rsa
Status: ✓ Removed via .gitignore
4-7. ... (additional findings)
=== RECOMMENDATIONS ===
IMMEDIATE ACTIONS:
1. Rotate the AWS key AKIAIOSFODNN7EXAMPLE immediately
2. Revoke the GitHub token ghp_xxx...
HISTORY CLEANUP:
To remove secrets from history, run:
$ secret-scan clean --commits abc1234,def5678
⚠️ This will rewrite history. Coordinate with your team!
All clones will need to re-clone or rebase.
$ secret-scan clean --commits abc1234
Rewriting history to remove secrets...
Using BFG Repo-Cleaner strategy...
Before: abc1234 → config/prod.env contains AKIAIOSFODNN7...
After: xyz7890 → config/prod.env contains ***REDACTED***
⚠️ Force push required:
git push --force-with-lease origin main
Secret successfully removed from history!
The Core Question You’re Answering
“How do secrets persist in Git history, and how do you find and remove them?”
Before you write any code, sit with this question. When you delete a file with secrets, Git still has the old version in its history. Even after a commit is no longer reachable from any branch, it exists in the object database until garbage collection.
Concepts You Must Understand First
Stop and research these before coding:
- Common Secret Patterns
- What regex patterns match AWS keys, GitHub tokens, etc.?
- How do you balance sensitivity (catch all secrets) vs. specificity (minimize false positives)?
- What’s entropy analysis and how does it help?
- Resource: truffleHog source code, GitGuardian patterns
- Git History Traversal
- How do you iterate through all commits and their files?
- How do you access file content at each commit without checkout?
- How do you handle large repositories efficiently?
- Book Reference: “Pro Git” Ch. 10 — Chacon
- History Rewriting
- What’s the difference between filter-branch, filter-repo, and BFG?
- What are the consequences of rewriting pushed history?
- How do you coordinate history rewrites with a team?
- Book Reference: “Pro Git” Ch. 7.6 — Chacon
Questions to Guide Your Design
Before implementing, think through these:
- Detection Strategy
- Should you scan every file in every commit or be smarter?
- How do you handle binary files?
- How do you prioritize findings by severity?
- Pattern Database
- How do you organize patterns for different secret types?
- How do you let users add custom patterns?
- How do you handle false positives (UUIDs that look like tokens)?
- Remediation
- Should you just report, or also offer to clean up?
- How do you handle secrets that are in current HEAD vs. only in history?
- What warnings do you give about history rewriting?
Thinking Exercise
Create and Detect a Secret
Commit a secret and trace what happens:
git init secret-test && cd secret-test
echo "AWS_KEY=AKIAIOSFODNN7EXAMPLE" > config.env
git add . && git commit -m "Add config"
# "Delete" the secret
rm config.env
git add . && git commit -m "Remove config"
# Is it really gone?
git log --all --oneline
git show HEAD~1:config.env
git log -p --all -S "AKIAIOSFODNN7"
Questions while tracing:
- Can you still see the secret?
- What git commands reveal it?
- What would git gc do?
- How would you truly remove it?
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you detect secrets in a Git repository’s history?”
- “What’s the difference between deleting a file and removing it from Git history?”
- “How would you handle a situation where a secret was pushed to a public repository?”
- “What’s entropy analysis and how does it help detect secrets?”
- “What are the risks of using git filter-branch on a shared repository?”
Hints in Layers
Hint 1: Starting Point
Use git rev-list --all (or git log --all --pretty=format:"%H") to enumerate every reachable commit, then git show <commit>:<path> to read file contents without a checkout. Skip the pathspec here — with one, git log applies history simplification and can silently drop commits. A sketch of this traversal follows.
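A minimal Python sketch of that traversal (the function name is illustrative; note the `seen` set, which deduplicates blobs by SHA so identical content is only scanned once across history):
```python
import subprocess

def iter_history_blobs(repo="."):
    """Yield (commit, path, text) for each distinct blob in history."""
    seen = set()
    commits = subprocess.run(
        ["git", "-C", repo, "rev-list", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    for commit in commits:
        # ls-tree -r lists every blob reachable from this commit's tree
        tree = subprocess.run(
            ["git", "-C", repo, "ls-tree", "-r", commit],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in tree.splitlines():
            meta, path = line.split("\t", 1)
            sha = meta.split()[2]  # "<mode> blob <sha>"
            if sha in seen:
                continue
            seen.add(sha)
            # cat-file -p prints the blob without touching the worktree
            blob = subprocess.run(
                ["git", "-C", repo, "cat-file", "-p", sha],
                capture_output=True, text=True, errors="replace",
            )
            yield commit, path, blob.stdout
```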
Hint 2: Pattern Matching
Create a pattern database with regexes like AKIA[0-9A-Z]{16} for AWS keys, ghp_[A-Za-z0-9]{36} for GitHub tokens.
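In code, the pattern database can start as small as a dict keyed by secret type — the three entries below are only a seed, and real scanners like truffleHog ship hundreds of vendored patterns:
```python
import re

# Seed patterns only; production scanners maintain far larger sets
SECRET_PATTERNS = {
    "AWS Access Key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub PAT (classic)": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "Private key block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}

def find_secrets(text):
    """Return (pattern_name, matched_string) pairs for every hit."""
    return [(name, m.group(0))
            for name, rx in SECRET_PATTERNS.items()
            for m in rx.finditer(text)]
```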
Hint 3: Entropy Analysis
Calculate Shannon entropy of strings. High entropy (> 4.5 bits/char) suggests randomness (keys, tokens). Low entropy is probably normal code.
Hint 4: History Cleaning
Use git filter-repo (successor to filter-branch) or BFG Repo-Cleaner. They’re faster and safer than raw filter-branch.
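As an illustration of what a `clean` subcommand might shell out to, here is a thin Python wrapper around git filter-repo's --replace-text rule file (a sketch: filter-repo must be installed separately, and it refuses to run on anything but a fresh clone unless you pass --force):
```python
import subprocess
import tempfile

def redact_in_history(repo, secret, replacement="***REDACTED***"):
    # filter-repo replace-text rules are one per line: literal==>replacement
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(f"{secret}==>{replacement}\n")
        rules = f.name
    # Rewrites every affected commit; all descendant hashes change too
    subprocess.run(
        ["git", "-C", repo, "filter-repo", "--replace-text", rules],
        check=True,
    )
```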
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| History rewriting | “Pro Git” by Chacon | Ch. 7.6 |
| Pattern matching | “Mastering Regular Expressions” by Friedl | Ch. 4-5 |
| Security practices | “The DevOps Handbook” by Kim et al. | Ch. 20 |
Implementation Hints
Entropy calculation:
/// Shannon entropy in bits per byte: H = -Σ p(b) · log2 p(b).
fn entropy(s: &str) -> f64 {
    // Count occurrences of each byte value
    let mut freq = [0u32; 256];
    for byte in s.bytes() {
        freq[byte as usize] += 1;
    }
    let len = s.len() as f64;
    freq.iter()
        .filter(|&&count| count > 0) // bytes that never occur contribute nothing
        .map(|&count| {
            let p = count as f64 / len;
            -p * p.log2() // this byte's contribution to H
        })
        .sum()
}

/// Random tokens pack roughly 5-6 bits/char; English text sits near 3-4,
/// so a threshold of 4.5 on strings of 20+ chars separates them well.
fn is_likely_secret(s: &str) -> bool {
    s.len() >= 20 && entropy(s) > 4.5
}
Learning milestones:
- You can scan files for pattern matches → You understand secret detection
- You can traverse all history efficiently → You understand Git object traversal
- You can calculate entropy to reduce false positives → You understand advanced detection
- You can suggest or perform history cleanup → You understand remediation
Final Overall Project: Git Workflow Platform
- File: LEARN_ADVANCED_GIT_WORKFLOWS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, TypeScript
- Coolness Level: Level 5: Pure Magic
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 5: Master
- Knowledge Area: Full-Stack / Platform Engineering / Git Internals
- Software or Tool: Git, GitHub, PostgreSQL
- Main Book: Software Engineering at Google by Winters et al.
What you’ll build: A comprehensive Git workflow platform that combines multiple projects—a web UI for managing repositories, integrated trunk-based development tooling, automatic code review, monorepo support, and analytics—essentially a mini version of GitHub’s enterprise features or Graphite.
Why this is the capstone: This project synthesizes everything you’ve learned. You’ll integrate Git internals knowledge with workflow automation, build tools that work together, and create a platform that could genuinely improve a team’s development experience.
Core challenges you’ll face:
- Integrating multiple subsystems → maps to understanding system architecture
- Building a web interface for Git operations → maps to understanding Git over HTTP
- Handling concurrency and scale → maps to understanding production systems
- Providing a cohesive user experience → maps to understanding developer tooling
Key Concepts:
- Platform architecture: Software Engineering at Google Ch. 16-18 — Winters et al.
- Git HTTP protocol: Git documentation on smart HTTP (the discovery round-trip is sketched below)
- Web application design: Designing Data-Intensive Applications — Kleppmann
Difficulty: Master Time estimate: 3-6 months Prerequisites: All previous projects completed
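To demystify that protocol reference: the first round-trip of smart HTTP is a plain GET for the ref advertisement, framed in pkt-lines (4 hex digits of length, then payload). A minimal read-only Python sketch against any public HTTPS clone URL (the function name is illustrative):
```python
import urllib.request

def advertised_refs(repo_url):
    """Fetch and parse the smart-HTTP ref advertisement (discovery step)."""
    url = f"{repo_url}/info/refs?service=git-upload-pack"
    body = urllib.request.urlopen(url).read()
    refs, pos = {}, 0
    while pos < len(body):
        size = int(body[pos:pos + 4], 16)  # pkt-line: 4 hex digits of length
        if size == 0:                      # "0000" flush packet
            pos += 4
            continue
        line = body[pos + 4:pos + size].decode(errors="replace")
        pos += size
        if line.startswith("#"):           # "# service=git-upload-pack" header
            continue
        sha, _, ref = line.partition(" ")
        # First ref line carries capabilities after a NUL; strip them
        refs[ref.split("\0")[0].strip()] = sha
    return refs

# Example: print where HEAD points on a public repository
print(advertised_refs("https://github.com/git/git.git").get("HEAD"))
```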
Real World Outcome
You’ll have a platform that makes Git workflows seamless:
┌─────────────────────────────────────────────────────────────────┐
│ Git Workflow Platform - Dashboard │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Repositories Quick Actions │
│ ┌─────────────────────────────┐ ┌─────────────────────┐ │
│ │ ● my-monorepo [healthy] │ │ New Repository │ │
│ │ ● api-service [2 PRs] │ │ Import from GitHub │ │
│ │ ● web-frontend [warning] │ │ Run Analytics │ │
│ └─────────────────────────────┘ └─────────────────────┘ │
│ │
│ Active Stacks │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ auth-refactor (3 PRs) │ │
│ │ PR #127 ✓ → PR #128 ⏳ → PR #129 ⏳ │ │
│ │ Next: Merge #127, auto-rebase stack │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Trunk Health │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ ✓ All tests passing │ │
│ │ ✓ No stale branches (oldest: 1.2 days) │ │
│ │ ⚠️ 2 PRs awaiting review > 24h │ │
│ │ ❌ Secret detected in commit abc123 (alert sent) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Recent Activity │
│ 10:32 alice merged PR #125 into main │
│ 10:28 Bot: Updated changelog for v2.3.0 │
│ 10:15 bob opened PR #130 "Add dark mode" │
│ 09:45 CI: All 847 tests passed on main │
│ │
└─────────────────────────────────────────────────────────────────┘
Features:
- Repository management with analytics
- Stacked PR management with automatic rebasing
- Trunk-based development enforcement
- Automated code review bot
- Secret scanning with alerts
- Conventional commit enforcement
- Automatic changelog generation
- Monorepo affected detection (see the sketch below)
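Of the features above, affected detection is the most monorepo-specific. The standard approach diffs against the merge-base with trunk and buckets changed paths by project. A minimal sketch — mapping paths to top-level directories is an assumption here; real tools consult a build graph:
```python
import subprocess

def affected_projects(repo=".", base="origin/main"):
    """Return top-level directories touched since diverging from trunk."""
    # Merge-base = the commit where this branch diverged from trunk
    merge_base = subprocess.run(
        ["git", "-C", repo, "merge-base", base, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    changed = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", merge_base, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    # Bucket changed files by their first path component (one dir per project)
    return sorted({path.split("/")[0] for path in changed if "/" in path})
```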
Learning milestones:
- You can design a multi-service architecture → You understand platform design
- You can integrate Git operations with a web UI → You understand Git’s interfaces
- You can handle concurrent Git operations → You understand locking and transactions
- Your platform improves developer workflow → You’ve built something genuinely useful
Project Comparison Table
| # | Project | Difficulty | Time | Depth | Fun Factor |
|---|---|---|---|---|---|
| 1 | Git Object Explorer | Intermediate | Weekend | ★★★★☆ | ★★★☆☆ |
| 2 | Commit Graph Visualizer | Intermediate | 1-2 weeks | ★★★★★ | ★★★★☆ |
| 3 | Interactive Rebase Simulator | Advanced | 1-2 weeks | ★★★★★ | ★★★★☆ |
| 4 | Three-Way Merge Engine | Expert | 2-4 weeks | ★★★★★ | ★★★☆☆ |
| 5 | Git Hooks Framework | Intermediate | 1 week | ★★★☆☆ | ★★★☆☆ |
| 6 | Trunk-Based Dev Pipeline | Advanced | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 7 | Code Review Bot | Advanced | 2-3 weeks | ★★★☆☆ | ★★★★★ |
| 8 | Monorepo Task Runner | Expert | 1 month+ | ★★★★★ | ★★★★★ |
| 9 | Git Bisect Automator | Intermediate | 1 week | ★★★☆☆ | ★★★★☆ |
| 10 | Stacked PRs Manager | Advanced | 2-3 weeks | ★★★★★ | ★★★★★ |
| 11 | Conventional Commits Enforcer | Intermediate | 1 week | ★★★☆☆ | ★★★☆☆ |
| 12 | Git Worktree Manager | Intermediate | 1-2 weeks | ★★★☆☆ | ★★★★☆ |
| 13 | Repository Analytics | Advanced | 2-3 weeks | ★★★★☆ | ★★★★★ |
| 14 | Git Secret Scanner | Advanced | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 15 | Git Workflow Platform | Master | 3-6 months | ★★★★★ | ★★★★★ |
Recommendation
For beginners to Git internals: Start with Project 1 (Git Object Explorer), then Project 2 (Commit Graph Visualizer). These build foundational understanding of what Git actually stores and how history works.
For intermediate developers: Start with Project 5 (Git Hooks Framework) if you want immediate practical value, or Project 3 (Interactive Rebase Simulator) if you want to deepen your understanding of rebase.
For advanced developers: Go straight to Project 8 (Monorepo Task Runner) or Project 10 (Stacked PRs Manager). These are the most impactful for real-world workflows.
Recommended progression for maximum learning:
- Project 1 → 2 → 3 (Git internals foundation)
- Project 5 → 11 → 6 (Workflow automation)
- Project 7 → 13 → 14 (Code quality and security)
- Project 10 → 8 (Advanced workflow tooling)
- Final Project (Integration and mastery)
Summary
This learning path covers advanced Git workflows through 15 hands-on projects. Here’s the complete list:
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Git Object Explorer | Python | Intermediate | Weekend |
| 2 | Commit Graph Visualizer | Python | Intermediate | 1-2 weeks |
| 3 | Interactive Rebase Simulator | Python | Advanced | 1-2 weeks |
| 4 | Three-Way Merge Engine | C | Expert | 2-4 weeks |
| 5 | Git Hooks Framework | Bash/Python | Intermediate | 1 week |
| 6 | Trunk-Based Development Pipeline | Python | Advanced | 2-3 weeks |
| 7 | Code Review Bot | Python | Advanced | 2-3 weeks |
| 8 | Monorepo Task Runner | Rust | Expert | 1 month+ |
| 9 | Git Bisect Automator | Python | Intermediate | 1 week |
| 10 | Stacked PRs Manager | Go | Advanced | 2-3 weeks |
| 11 | Conventional Commits Enforcer | Rust | Intermediate | 1 week |
| 12 | Git Worktree Manager | Go | Intermediate | 1-2 weeks |
| 13 | Repository Analytics Dashboard | Python | Advanced | 2-3 weeks |
| 14 | Git Secret Scanner | Rust | Advanced | 2-3 weeks |
| 15 | Git Workflow Platform (Capstone) | Go | Master | 3-6 months |
Recommended Learning Path
For beginners: Start with projects #1, #2, #5
For intermediate: Jump to projects #3, #6, #9, #11
For advanced: Focus on projects #8, #10, #14, #15
Expected Outcomes
After completing these projects, you will:
- Understand Git’s internal object model and how commits form a DAG
- Master rebase strategies and know when to use merge vs. rebase
- Implement trunk-based development with feature flags
- Build tools that automate code review and enforce quality
- Create monorepo tooling for affected detection and caching
- Detect and remediate security issues in Git history
- Design and build Git workflow platforms
You’ll have built 15 working projects that demonstrate deep understanding of Git workflows from first principles—from parsing objects to building enterprise-grade platforms.