Learn Git Mastery: From User to Git Guru

Goal: Master the usage of Git to a professional level. You will go beyond add/commit/push to understanding how to navigate complex history, rewrite mistakes, debug regressions automatically, and safely manage messy merge conflicts. You will become the person everyone calls when they “broke the repo.”

Why Git Mastery Matters

Git is the most widely used version control system in software development, yet many developers rely on only a small subset of its features. When something goes wrong (detached HEAD, merge conflict, committed secrets), a common reaction is to panic, delete the folder, and clone again.

Mastering Git means:

Fearlessness: You know that (almost) nothing is ever truly lost.
Precision: You craft clean commit history that tells a story, rather than a “wip”, “fix”, “fix again” mess.
Speed: You use bisect to binary-search history, so the commit that introduced a bug is found in a handful of steps instead of by checking commits one by one.

Core Concept Analysis

1. The Mental Model: The Graph

Stop thinking of Git as a linear timeline. Think of it as a Graph of Snapshots.

      (main)
         ↓
A <─── B <─── C
              ↑
           (feature)

Commit: A snapshot of the entire project state (not just a diff).
Branch: A movable sticker (pointer) pointing to a specific commit.
HEAD: A “You Are Here” pointer attached to your current view.

2. The Three Areas (The Stage)

You must intuitively know where your changes are.

+----------------+      +----------------+      +----------------+
| Working Dir    | ---> |    Staging     | ---> |   Repository   |
| (Your Editor)  | add  |    (Index)     | commit|   (History)    |
+----------------+      +----------------+      +----------------+
        ^                       |
        |        checkout       |
        +-----------------------+

3. Movement vs. Change

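Movement: checkout, switch. You move HEAD to a different branch or commit. Files in the working directory change to match that snapshot.
Change: reset. You move the current branch pointer to a different commit. The commits themselves are untouched, but the branch now tells a different history.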
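
The difference can be sketched in a few lines of Python. This is a toy model, not Git's implementation: the names `refs`, `head`, `checkout`, and `reset` are made up for illustration, and commits are reduced to single letters.

```python
# Toy model: branches map to commit ids, HEAD names the current branch.
refs = {"main": "C", "feature": "C"}
head = {"branch": "main"}

def checkout(branch):
    """Movement: attach HEAD to another branch; no branch pointer changes."""
    head["branch"] = branch

def reset(commit):
    """Change: move the branch that HEAD is attached to."""
    refs[head["branch"]] = commit

checkout("feature")   # HEAD now follows 'feature'; refs are untouched
reset("B")            # 'feature' now points at B; 'main' still points at C
print(head["branch"], refs)
```

After the two calls, only `feature` has moved. `main` still points at C, which is why a reset on one branch never disturbs another.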

Git Object Model

Reference: “Pro Git” by Scott Chacon - Chapter 10: Git Internals

At its heart, Git is a content-addressable filesystem with a version control system built on top. Understanding this model transforms Git from a mysterious tool into a predictable system.

The Four Object Types

Git stores everything as objects in a key-value database. Every object has a type and a unique identifier (SHA-1 hash).

+------------------------------------------------------------------+
|                       GIT OBJECT TYPES                            |
+------------------------------------------------------------------+
|                                                                   |
|   BLOB (Binary Large OBject)                                      |
|   +------------------+                                            |
|   | File Content     |  <-- Stores ONLY the content, not the name |
|   | (compressed)     |      Two identical files = ONE blob        |
|   +------------------+                                            |
|   SHA: a906cb2a4a904...                                           |
|                                                                   |
|   TREE (Directory Snapshot)                                       |
|   +------------------+                                            |
|   | 100644 blob a906 |  <-- mode, type, hash, filename            |
|   | 100644 blob b042 |      Points to blobs (files)               |
|   | 040000 tree d8a3 |      or other trees (subdirectories)       |
|   +------------------+                                            |
|   SHA: d8a3e5f9c2b1...                                            |
|                                                                   |
|   COMMIT (Snapshot + Metadata)                                    |
|   +------------------+                                            |
|   | tree d8a3e5f9c2b |  <-- Points to root tree                   |
|   | parent a1b2c3d4e |  <-- Points to previous commit(s)          |
|   | author: Alice    |      (merges have multiple parents)        |
|   | message: "Fix"   |                                            |
|   +------------------+                                            |
|   SHA: e7f8a9b0c1d2...                                            |
|                                                                   |
|   TAG (Named Reference to Commit)                                 |
|   +------------------+                                            |
|   | object e7f8a9b0c |  <-- Points to a commit                    |
|   | type commit      |      (annotated tags are objects)          |
|   | tag v1.0.0       |                                            |
|   | tagger: Bob      |                                            |
|   +------------------+                                            |
|   SHA: f1e2d3c4b5a6...                                            |
|                                                                   |
+------------------------------------------------------------------+

Blobs: Content Without Identity

A blob stores file content, nothing more. It does not know:

Its filename
Its location in the directory structure
When it was created

                   TWO FILES, SAME CONTENT = ONE BLOB

     src/utils.js            tests/fixtures/utils.js
          |                           |
          |    (identical content)    |
          +----------+  +-------------+
                     |  |
                     v  v
               +------------+
               |   BLOB     |
               | a906cb2... |
               | "function  |
               |  add(a,b)  |
               |  {return   |
               |   a+b;}"   |
               +------------+

Key Insight: This is why Git is so efficient. If you have 100 copies of a file across branches, Git stores ONE blob.
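
A small Python sketch of this deduplication, assuming a dictionary as a stand-in for .git/objects. The file paths and contents are hypothetical; the id computation follows the blob header format described in the SHA-1 section.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Id Git assigns to a blob: SHA-1 over a small header plus the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical working tree: two paths hold identical bytes.
files = {
    "src/utils.js": b"function add(a,b){return a+b;}\n",
    "tests/fixtures/utils.js": b"function add(a,b){return a+b;}\n",
    "README.md": b"# demo\n",
}

object_store = {}  # object id -> content (stand-in for .git/objects)
for path, content in files.items():
    object_store[blob_id(content)] = content

print(len(files), "paths ->", len(object_store), "blobs")
```

Because the filename is not part of the hashed data, the two utils.js paths collapse into a single entry in the store.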

Trees: The Directory Snapshot

A tree represents a directory at a specific moment. It contains pointers to blobs (files) and other trees (subdirectories).

                    TREE STRUCTURE EXAMPLE

     Root Tree (d8a3e5f...)
     +---------------------------------------------+
     | mode   | type | SHA-1      | name           |
     |--------|------|------------|----------------|
     | 100644 | blob | a906cb2... | README.md      |
     | 100644 | blob | b042f7d... | package.json   |
     | 040000 | tree | c5d8e9f... | src/           |
     +---------------------------------------------+
                                         |
                                         v
                              Subtree (c5d8e9f...)
                              +--------------------------------+
                              | 100644 | blob | d1e2f3... | index.js   |
                              | 100644 | blob | e4f5a6... | utils.js   |
                              +--------------------------------+

     File Modes:
       100644 = Regular file
       100755 = Executable file
       040000 = Subdirectory (tree)
       120000 = Symbolic link
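
To make the table concrete, here is a hedged Python sketch of how a tree could be serialized and hashed. On disk each entry is "mode name", a NUL byte, then the 20 raw bytes of the child id, and directory modes are written as 40000 (cat-file pads them to 040000 for display). The helper names and file contents are illustrative, and the sort is simplified as noted in the comments.

```python
import hashlib

def object_id(obj_type: bytes, body: bytes) -> str:
    header = obj_type + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Serialize (mode, name, hex_id) entries as a tree object stores them."""
    out = b""
    # Simplification: plain sort by name. Git sorts directories as if their
    # name ended in "/", which only matters for names like "a" vs "a.txt".
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        out += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_id)
    return out

readme = object_id(b"blob", b"# demo\n")
index_js = object_id(b"blob", b"console.log('hi');\n")

src_tree = object_id(b"tree", tree_body([("100644", "index.js", index_js)]))
root_tree = object_id(b"tree", tree_body([
    ("100644", "README.md", readme),
    ("40000", "src", src_tree),   # stored without the leading zero
]))
print("root tree:", root_tree)
print("empty tree:", object_id(b"tree", b""))
```

Changing any file changes its blob id, which changes the subtree id, which changes the root tree id. That chain is what lets a single root hash stand for the whole snapshot.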

Commits: Snapshots in Time

A commit ties everything together: it points to a root tree (the complete project snapshot) and to its parent commit(s).

                         COMMIT ANATOMY

     +-----------------------------------------------------------+
     | COMMIT e7f8a9b0c1d2e3f4a5b6c7d8e9f0...                    |
     +-----------------------------------------------------------+
     | tree      d8a3e5f9c2b1a0...   <-- Root directory snapshot |
     | parent    a1b2c3d4e5f6a7...   <-- Previous commit         |
     | author    Alice <alice@ex.com> 1703520000 +0000           |
     | committer Alice <alice@ex.com> 1703520000 +0000           |
     |                                                           |
     | Fix critical bug in authentication module                 |
     |                                                           |
     | The token validation was not checking expiration time,    |
     | allowing expired tokens to authenticate users.            |
     +-----------------------------------------------------------+
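
The anatomy above maps directly onto bytes. The following Python sketch builds the text of a commit and hashes it with the same "type size NUL" header used for other objects. The function name `commit_id` and the commit messages are illustrative; the tree id used is the id of a tree with no entries, chosen only so the example needs no other objects.

```python
import hashlib

def commit_id(tree, parents, author, committer, message):
    """Hash a commit: header + text body, in the field order shown above."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]   # zero, one, or several parents
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # tree with no entries
who = "Alice <alice@example.com> 1703520000 +0000"

root = commit_id(EMPTY_TREE, [], who, who, "Initial commit")
fix = commit_id(EMPTY_TREE, [root], who, who, "Fix critical bug")
reworded = commit_id(EMPTY_TREE, [root], who, who, "Fix critical bug!")
print(root, fix, reworded, sep="\n")
```

Every field takes part in the hash, so editing only the message (as `git commit --amend` does) already yields a different commit id.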

The SHA-1 Hash: Content Addressing

Every object’s ID is a SHA-1 hash of its content. This provides:

Integrity: Any corruption changes the hash
Deduplication: Identical content = identical hash
Immutability: Changing content creates a NEW object

               HOW SHA-1 HASHING WORKS

     Content: "Hello, Git!"
          |
          v
     +------------------+
     | SHA-1 Algorithm  |
     +------------------+
          |
          v
     Hash: 943a702d06f34599...


     Even ONE character change produces a completely different hash
     (the values shown are illustrative):

     "Hello, Git!"  --> 943a702d06f34599...
     "Hello, Git?"  --> f7d2e8a91b3c4d5e...  (totally different!)


     The hash includes a header:

     +----------------------------------------------------+
     | "blob" + " " + content_length + "\0" + content     |
     +----------------------------------------------------+
          |
          v
     SHA-1 --> a906cb2a4a904a152e80877d...
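
The header rule can be checked with Python's standard hashlib. This is a minimal sketch; the function name `git_object_id` is an assumption for illustration, and the sample string is arbitrary.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """SHA-1 over "<type> <size>\\0" followed by the raw content."""
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

content = b"Hello, Git!\n"
print(git_object_id("blob", content))       # id Git would assign to this blob
print(hashlib.sha1(content).hexdigest())    # plain SHA-1 of the bytes: differs
print(git_object_id("blob", b""))           # id of the empty blob
```

Running `git hash-object` on a file with the same bytes should report the first value, not the second, because of the header.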

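Note: SHA-1 is a cryptographic hash function, so its outputs look uniformly distributed and even a one-character change in the input yields a drastically different hash (the avalanche effect). An accidental collision between two different pieces of content is astronomically unlikely. Deliberately constructed SHA-1 collisions have been demonstrated, however (the SHAttered attack, 2017), which is why Git by default uses a SHA-1 implementation that detects such attacks and also offers SHA-256 as an alternative object format. For background on how data is represented as bits and bytes, see “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron (Chapter 2).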

The Directed Acyclic Graph (DAG)

All Git objects form a DAG - a graph where:

Edges point backward in time (child -> parent)
No cycles exist (you cannot be your own ancestor)

                    THE GIT DAG

           Tags/Branches (lightweight pointers)
                   |
                   v

     [v1.0]     [main]        [feature]
        |          |              |
        v          v              v
     +----+     +----+         +----+
     | C1 |<----| C4 |<--------| C6 |
     +----+     +----+         +----+
        ^          ^              |
        |          |              |
     +----+     +----+         +----+
     | C0 |<----| C3 |<--------| C5 |
     +----+     +----+         +----+
                   ^
                   |
                +----+
                | C2 |  <-- Merge commit (two parents)
                +----+
                /    \
               /      \
           +----+  +----+
           | A1 |  | B1 |
           +----+  +----+
               \      /
                \    /
                +----+
                | A0 |  <-- Common ancestor
                +----+


     READING THE DAG:
     - Arrows point to PARENTS (backward in time)
     - Branches are just LABELS on commits
     - Tags are PERMANENT labels (usually)
     - Every commit except the root has at least one parent
     - Merge commits have multiple parents
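
A DAG like this is easy to model as a mapping from each commit to its parents. The Python sketch below uses a small hypothetical history (not the exact diagram above) to show how "everything reachable from a commit" and "shared history of two branches" fall out of following parent edges.

```python
# Hypothetical history: each commit maps to its list of parents.
parents = {
    "A0": [],
    "A1": ["A0"],
    "B1": ["A0"],
    "M": ["A1", "B1"],   # merge commit: two parents
}

def ancestors(commit):
    """The commit plus everything reachable by following parent edges."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

print(sorted(ancestors("M")))                       # whole history of the merge
print(sorted(ancestors("A1") & ancestors("B1")))    # history both sides share
```

The intersection of the two ancestor sets is where Git looks for a merge base. Because edges only point to parents, the walk always terminates.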

Visualizing Objects with git cat-file

You can inspect any object:

# See object type
$ git cat-file -t e7f8a9b
commit

# See object content
$ git cat-file -p e7f8a9b
tree d8a3e5f9c2b1a0...
parent a1b2c3d4e5f6a7...
author Alice <alice@example.com> 1703520000 +0000
committer Alice <alice@example.com> 1703520000 +0000

Fix critical bug in authentication module

# Inspect a tree
$ git cat-file -p d8a3e5f
100644 blob a906cb2...    README.md
040000 tree c5d8e9f...    src

# Inspect a blob
$ git cat-file -p a906cb2
# (prints raw file content)

How Git Stores Data

Reference: “Pro Git” by Scott Chacon - Chapter 10.2: Git Objects
Additional context: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron - Chapter 10: System-Level I/O

The .git Directory Structure

When you run git init, Git creates a .git directory that IS your repository. Everything else in your project folder is the “working tree.”

                    .git DIRECTORY ANATOMY

     my-project/
     +-- .git/                    <-- THE REPOSITORY
     |   +-- HEAD                 <-- Current branch pointer
     |   +-- config               <-- Repository configuration
     |   +-- description          <-- GitWeb description
     |   +-- hooks/               <-- Client/server-side scripts
     |   |   +-- pre-commit.sample
     |   |   +-- pre-push.sample
     |   |   +-- ...
     |   +-- info/
     |   |   +-- exclude          <-- Local gitignore (not shared)
     |   +-- objects/             <-- THE OBJECT DATABASE
     |   |   +-- pack/            <-- Packed objects (compressed)
     |   |   +-- info/            <-- Pack metadata
     |   |   +-- a9/              <-- Loose objects (first 2 chars of SHA)
     |   |   |   +-- 06cb2a4a904a152e80877d...
     |   |   +-- d8/
     |   |   |   +-- a3e5f9c2b1a0...
     |   |   +-- ...
     |   +-- refs/                <-- BRANCH/TAG POINTERS
     |   |   +-- heads/           <-- Local branches
     |   |   |   +-- main         <-- Contains: e7f8a9b0c1d2...
     |   |   |   +-- feature      <-- Contains: a1b2c3d4e5f6...
     |   |   +-- tags/            <-- Tag references
     |   |   |   +-- v1.0.0       <-- Contains: f1e2d3c4b5a6...
     |   |   +-- remotes/         <-- Remote-tracking branches
     |   |       +-- origin/
     |   |           +-- main     <-- Contains: c7d8e9f0a1b2...
     |   +-- index                <-- THE STAGING AREA (binary)
     |   +-- logs/                <-- Reflog data
     |       +-- HEAD
     |       +-- refs/
     |           +-- heads/
     |               +-- main
     +-- src/                     <-- YOUR WORKING TREE
     +-- README.md                    (not part of .git)
     +-- ...

Loose vs Packed Objects

Git stores objects in two ways:

Loose Objects: Each object is a single compressed file.

                    LOOSE OBJECT STORAGE

     When you commit, Git creates loose objects:

     .git/objects/
     +-- a9/
     |   +-- 06cb2a4a904a152e80877d4f2a5d...   <-- First 2 chars = dir
     |                                              Rest = filename
     +-- d8/
     |   +-- a3e5f9c2b1a0e7f8...
     +-- e7/
         +-- f8a9b0c1d2e3f4a5b6...


     Inside a loose object (simplified):

     +---------------------------------------------------+
     | ZLIB COMPRESSED                                    |
     | +-----------------------------------------------+ |
     | | "blob 14\0Hello, World!\n"                    | |
     | |  type  size  NUL byte  content                | |
     | +-----------------------------------------------+ |
     +---------------------------------------------------+
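
The loose-object layout can be reproduced with hashlib and zlib. This sketch only computes the path and bytes in memory and writes nothing to disk; the function names `encode_loose` and `decode_loose` are illustrative.

```python
import hashlib
import zlib

def encode_loose(content: bytes):
    """Return (path, compressed bytes) for a blob stored as a loose object."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    path = f".git/objects/{oid[:2]}/{oid[2:]}"   # first 2 hex chars = directory
    return path, zlib.compress(store)

def decode_loose(data: bytes):
    """Inverse: decompress, split the header at the NUL byte, check the size."""
    raw = zlib.decompress(data)
    header, _, content = raw.partition(b"\0")
    obj_type, size = header.split(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode(), content

path, data = encode_loose(b"Hello, World!\n")
print(path)
print(decode_loose(data))
```

Note that the id is computed over the uncompressed bytes, so the compression level never affects an object's name.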

Packed Objects: Git periodically compresses loose objects into “pack files” for efficiency.

                    PACKED OBJECT STORAGE

     After running `git gc` or `git repack`:

     .git/objects/pack/
     +-- pack-a1b2c3d4e5f6...idx   <-- INDEX: Fast lookup table
     +-- pack-a1b2c3d4e5f6...pack  <-- DATA: All objects compressed


     Pack File Structure:

     +----------------------------------------------------------+
     | HEADER (12 bytes)                                         |
     | "PACK" | version | object_count                           |
     +----------------------------------------------------------+
     | OBJECT 1                                                  |
     | +------------------------------------------------------+ |
     | | type | size | [base_ref] | delta_or_content          | |
     | +------------------------------------------------------+ |
     +----------------------------------------------------------+
     | OBJECT 2 (delta-compressed against Object 1)             |
     | +------------------------------------------------------+ |
     | | OBJ_REF_DELTA | base_SHA | delta_instructions        | |
     | +------------------------------------------------------+ |
     +----------------------------------------------------------+
     | ... more objects ...                                      |
     +----------------------------------------------------------+
     | SHA-1 CHECKSUM (20 bytes)                                 |
     +----------------------------------------------------------+


     DELTA COMPRESSION EXAMPLE:

     version_1.txt (1000 bytes):     "function add(a, b) { ... }"
     version_2.txt (1005 bytes):     "function add(a, b, c) { ... }"

     Instead of storing 2005 bytes, Git stores:
     - Base: version_1.txt (1000 bytes, compressed)
     - Delta: "at offset 17, insert ', c'" (~30 bytes)

     Total: ~1030 bytes instead of 2005 bytes (about 49% savings)
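
As a small illustration of the 12-byte pack header, the Python sketch below parses the signature, version, and object count, which are stored as big-endian 32-bit integers. The sample bytes are constructed by hand rather than read from a real pack file, and `parse_pack_header` is a hypothetical helper name.

```python
import struct

def parse_pack_header(data: bytes):
    """Read "PACK", the version, and the object count from the first 12 bytes."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a pack file")
    return version, count

# Hand-built header: version 2, three objects.
sample = b"PACK" + struct.pack(">II", 2, 3)
print(parse_pack_header(sample))
```

To try it on a real repository, pass the first 12 bytes of a file from .git/objects/pack/ that ends in .pack.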

As explained in “Computer Systems: A Programmer’s Perspective” (Chapter 10), file I/O operations at the system level involve buffering and caching. Git leverages memory-mapped files and efficient I/O patterns to minimize disk access when reading objects, especially from pack files.

The Index (Staging Area)

The index is a binary file (.git/index) that represents your next commit. It is NOT just a list of staged files; it is a complete snapshot.

                    THE INDEX FILE

     +----------------------------------------------------------------+
     | .git/index (Binary Format)                                      |
     +----------------------------------------------------------------+
     | HEADER                                                          |
     | +------------------------------------------------------------+ |
     | | "DIRC" | version (2/3/4) | entry_count                     | |
     | +------------------------------------------------------------+ |
     +----------------------------------------------------------------+
     | ENTRIES (one per tracked file)                                  |
     | +------------------------------------------------------------+ |
     | | ctime | mtime | dev | ino | mode | uid | gid | size |      | |
     | | SHA-1 of blob | flags | path                                | |
     | +------------------------------------------------------------+ |
     | | Entry 2...                                                  | |
     | | Entry 3...                                                  | |
     +----------------------------------------------------------------+
     | EXTENSIONS (optional)                                           |
     | - TREE: cached tree objects                                     |
     | - REUC: resolve undo (for conflicts)                            |
     +----------------------------------------------------------------+
     | SHA-1 CHECKSUM                                                  |
     +----------------------------------------------------------------+


     THE THREE-TREE ARCHITECTURE:

     +----------------+      +----------------+      +----------------+
     |   HEAD (Repo)  |      |     INDEX      |      |  Working Dir   |
     +----------------+      +----------------+      +----------------+
     | README.md v1   |      | README.md v2   |      | README.md v3   |
     | src/index.js   |      | src/index.js   |      | src/index.js   |
     |                |      | NEW: config.js |      | NEW: config.js |
     +----------------+      +----------------+      +----------------+
           |                        |                       |
           |   git diff --cached    |     git diff          |
           +------------------------+                       |
           |                                                |
           +------------------------------------------------+
                           git diff HEAD


     Commands that move content between the three trees:

     git add file.txt          --> Copies working dir version TO index
     git reset file.txt        --> Copies HEAD version TO index
                                   (same as: git restore --staged file.txt)
     git checkout -- file.txt  --> Copies index version TO working dir
                                   (same as: git restore file.txt)
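
The three trees can be modeled as three dictionaries from path to content. The sketch below is a toy, with version labels standing in for file contents and a hypothetical helper `changed` standing in for the diff commands; it replays the README.md and config.js situation from the diagram.

```python
head = {"README.md": "v1", "src/index.js": "v1"}
index = dict(head)      # starts identical to HEAD
workdir = dict(head)    # starts identical to the index

def changed(old, new):
    """Paths whose content differs between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))

workdir["README.md"] = "v2"                  # edit the file
index["README.md"] = workdir["README.md"]    # git add README.md
workdir["README.md"] = "v3"                  # edit it again, do not add
workdir["config.js"] = "v1"                  # create a new file
index["config.js"] = workdir["config.js"]    # git add config.js

print("git diff --cached:", changed(head, index))
print("git diff         :", changed(index, workdir))
print("git diff HEAD    :", changed(head, workdir))
```

README.md shows up in both `--cached` and the plain diff because the index holds v2 while the working directory has already moved on to v3.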

How a Commit is Created

                    COMMIT CREATION FLOW

     STEP 1: You run "git add file.txt"
     +--------------------------------------------------------------------+
     |                                                                     |
     |  Working Dir                    .git/objects/                       |
     |  +------------+                 +-------------------+               |
     |  | file.txt   |  --HASH--->    | a9/06cb2a4a...   |  (new blob)   |
     |  | "content"  |                 | (zlib compressed) |               |
     |  +------------+                 +-------------------+               |
     |                                                                     |
     |  .git/index is updated:                                             |
     |  +----------------------------------+                               |
     |  | file.txt -> SHA: a906cb2a4a...  |                               |
     |  +----------------------------------+                               |
     |                                                                     |
     +--------------------------------------------------------------------+

     STEP 2: You run "git commit -m 'Add file'"
     +--------------------------------------------------------------------+
     |                                                                     |
     |  Git reads the index and creates TREE objects:                      |
     |                                                                     |
     |  .git/objects/                                                      |
     |  +-------------------+                                              |
     |  | d8/a3e5f9c2...   |  (new tree - root directory)                 |
     |  | blob a906 file.txt                                               |
     |  +-------------------+                                              |
     |                                                                     |
     |  Git creates a COMMIT object:                                       |
     |  +-------------------+                                              |
     |  | e7/f8a9b0c1...   |  (new commit)                                |
     |  | tree d8a3e5f9c2  |                                               |
     |  | parent <prev>    |                                               |
     |  | author Alice...  |                                               |
     |  | "Add file"       |                                               |
     |  +-------------------+                                              |
     |                                                                     |
     |  Git updates the branch pointer:                                    |
     |  +------------------------------+                                   |
     |  | .git/refs/heads/main         |                                   |
     |  | e7f8a9b0c1d2e3f4...          |                                   |
     |  +------------------------------+                                   |
     |                                                                     |
     +--------------------------------------------------------------------+
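
The two steps can be strung together in a tiny in-memory model. This is a sketch under stated assumptions: `objects`, `index`, and `refs` are plain dictionaries, the tree is a single flat directory written in a simplified text encoding (not Git's binary tree format), so the ids will not match real Git ids.

```python
import hashlib

objects, index, refs = {}, {}, {}

def store(obj_type: str, body: bytes) -> str:
    """Write an object into the store and return its id."""
    data = obj_type.encode() + b" " + str(len(body)).encode() + b"\0" + body
    oid = hashlib.sha1(data).hexdigest()
    objects[oid] = data
    return oid

def add(path: str, content: bytes):
    # STEP 1: create a blob and record path -> blob id in the index.
    index[path] = store("blob", content)

def commit(message: str, branch: str = "main") -> str:
    # STEP 2: snapshot the index as a tree, wrap it in a commit, move the ref.
    listing = "".join(f"100644 blob {oid}\t{p}\n" for p, oid in sorted(index.items()))
    tree = store("tree", listing.encode())
    lines = [f"tree {tree}"]
    if branch in refs:
        lines.append(f"parent {refs[branch]}")
    lines += ["author Alice <alice@example.com> 1703520000 +0000", "", message, ""]
    refs[branch] = store("commit", "\n".join(lines).encode())
    return refs[branch]

add("file.txt", b"content\n")
first = commit("Add file")
add("file.txt", b"content v2\n")
second = commit("Update file")
print(len(objects), "objects; main ->", refs["main"][:7])
```

Two commits of one file produce six objects: a blob, a tree, and a commit each time. The only thing that is overwritten is the ref.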

Remote Tracking Explained

Reference: “Pro Git” by Scott Chacon - Chapter 3.5: Remote Branches

What Are Remote-Tracking Branches?

A remote-tracking branch is a local pointer that remembers where a branch was on a remote repository the last time you communicated with it.

                    LOCAL vs REMOTE-TRACKING BRANCHES

     .git/refs/
     +-- heads/                    <-- YOUR local branches
     |   +-- main                  e7f8a9b0... (you control this)
     |   +-- feature               a1b2c3d4... (you control this)
     |
     +-- remotes/                  <-- REMOTE-TRACKING branches
         +-- origin/               (read-only snapshots)
             +-- main              c7d8e9f0... (updated by fetch)
             +-- feature           b5c6d7e8... (updated by fetch)


     KEY INSIGHT:

     "main"          = Your local branch. YOU move it.
     "origin/main"   = A bookmark. Git moves it when you fetch/push.

     You can check out origin/main, but only as a detached HEAD;
     commits you make there never move origin/main.
     It is a snapshot of "where was main on origin last time I checked?"

The Mental Model

                    YOUR COMPUTER vs THE SERVER

     +---------------------------+     +---------------------------+
     |      YOUR COMPUTER        |     |     GITHUB (origin)       |
     +---------------------------+     +---------------------------+
     |                           |     |                           |
     |  Working Directory        |     |                           |
     |  +-------------------+    |     |                           |
     |  | (your files)      |    |     |                           |
     |  +-------------------+    |     |                           |
     |                           |     |                           |
     |  Local Branches:          |     |  Remote Branches:         |
     |  +-------------------+    |     |  +-------------------+    |
     |  | main     -> C5    |    |     |  | main     -> C4    |    |
     |  | feature  -> C7    |    |     |  | feature  -> C6    |    |
     |  +-------------------+    |     |  +-------------------+    |
     |                           |     |                           |
     |  Remote-Tracking:         |     |                           |
     |  +-------------------+    |     |                           |
     |  | origin/main -> C4 |<---|-----|-- (snapshot from fetch)   |
     |  | origin/feat -> C6 |<---|-----|-- (snapshot from fetch)   |
     |  +-------------------+    |     |                           |
     |                           |     |                           |
     +---------------------------+     +---------------------------+


     YOUR WORKFLOW:

     1. You commit locally      : main goes from C4 -> C5
     2. origin/main stays at C4 : It doesn't know about C5 yet
     3. Someone pushes to GitHub: GitHub's main goes to C8
     4. You git fetch           : origin/main updates to C8
     5. Your main is still C5   : main and origin/main have diverged
                                  (you are ahead by C5 and behind by
                                  the new remote commits)
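
The workflow above can be replayed with two dictionaries of refs, one per machine. This is an illustrative model only: commits are just labels, and `fetch` is a hypothetical function that copies branch positions without transferring any objects.

```python
local = {"refs/heads/main": "C4", "refs/remotes/origin/main": "C4"}
server = {"refs/heads/main": "C4"}

def fetch(local_refs, server_refs, remote="origin"):
    """Copy each server branch into refs/remotes/<remote>/; touch nothing else."""
    for ref, commit in server_refs.items():
        branch = ref[len("refs/heads/"):]
        local_refs[f"refs/remotes/{remote}/{branch}"] = commit

local["refs/heads/main"] = "C5"               # 1. you commit locally
before = local["refs/remotes/origin/main"]    # 2. origin/main is still C4
server["refs/heads/main"] = "C8"              # 3. someone pushes to the server
fetch(local, server)                          # 4. you run git fetch
print(before, "->", local["refs/remotes/origin/main"])
print("local main:", local["refs/heads/main"])   # 5. unchanged
```

The remote-tracking ref is the only thing `fetch` writes, which is why fetching is always safe to run.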

Fetch, Pull, and Push Explained

git fetch: Download objects and update remote-tracking branches. Does NOT touch your local branches.

                    GIT FETCH

     BEFORE FETCH:

     Your Computer                           GitHub
     +------------------+                    +------------------+
     | main     -> C3   |                    | main     -> C5   |
     | origin/main -> C3|                    |                  |
     +------------------+                    +------------------+

     $ git fetch origin

     AFTER FETCH:

     Your Computer                           GitHub
     +------------------+                    +------------------+
     | main     -> C3   |  (unchanged!)      | main     -> C5   |
     | origin/main -> C5|  (updated!)        |                  |
     +------------------+                    +------------------+

     New objects (C4, C5) are now in your .git/objects/
     But your working directory is unchanged.


     VISUAL:

     Before:  A---B---C (main, origin/main)

     After:   A---B---C (main)
                      \
                       D---E (origin/main)

git pull: Fetch + Merge (or Rebase).

                    GIT PULL (fetch + merge)

     BEFORE PULL:

     Local:    A---B---C (main)
                    \
     origin:         D---E (origin/main after fetch)

     $ git pull origin main
     # Roughly equivalent to: git fetch origin && git merge origin/main

     AFTER PULL:

     Local:    A---B---C-------F (main)  <-- merge commit
                    \         /
                     D-------E (origin/main)


     GIT PULL --REBASE:

     BEFORE:

     Local:    A---B---C (main)
     origin:        \
                     D---E (origin/main)

     $ git pull --rebase origin main

     AFTER:

     Local:    A---B---D---E---C' (main)  <-- your commit replayed
                           ^
                           (origin/main)

     NOTE: C became C' because its parent changed (new SHA-1).
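
Why a replayed commit gets a new id can be shown with a toy model in which a commit's id is a hash of its parent's id plus a label for its change. Real commits hash more fields, so the function `commit_id` here is a deliberate simplification, and `rebase` is a hypothetical helper.

```python
import hashlib

def commit_id(parent: str, change: str) -> str:
    """Toy id: depends on the parent id and on the change itself."""
    return hashlib.sha1(f"{parent}\n{change}".encode()).hexdigest()[:7]

a = commit_id("", "A")
b = commit_id(a, "B")
c = commit_id(b, "C")    # your local commit, based on B
d = commit_id(b, "D")    # fetched from origin
e = commit_id(d, "E")    # fetched from origin

def rebase(changes, onto):
    """Replay each change, in order, on top of a new base."""
    tip = onto
    for change in changes:
        tip = commit_id(tip, change)
    return tip

c_prime = rebase(["C"], onto=e)
print("C  =", c)
print("C' =", c_prime)
```

The change labelled "C" is identical in both commits, yet the ids differ because the parent differs. This is the reason rebasing commits that others already have forces them to reconcile two versions of the same work.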

git push: Upload your commits and update remote branches.

                    GIT PUSH

     BEFORE PUSH:

     Your Computer                           GitHub
     +------------------+                    +------------------+
     | main     -> C5   |                    | main     -> C3   |
     | origin/main -> C3|                    |                  |
     +------------------+                    +------------------+

     Local:    A---B---C---D---E (main)
                       ^
                       (origin/main)

     $ git push origin main

     AFTER PUSH:

     Your Computer                           GitHub
     +------------------+                    +------------------+
     | main     -> C5   |                    | main     -> C5   |
     | origin/main -> C5|  (updated!)        |                  |
     +------------------+                    +------------------+

     Local:    A---B---C---D---E (main, origin/main)


     PUSH REJECTED SCENARIO:

     Your Computer                           GitHub
     +------------------+                    +------------------+
     | main     -> F    |                    | main     -> E    |
     | origin/main -> C |  (stale!)          |                  |
     +------------------+                    +------------------+

     Local:    A---B---C---F (main)
                       \
     GitHub:            D---E (main on remote)

     $ git push origin main
      ! [rejected]        main -> main (non-fast-forward)

     Solution 1: git pull (creates merge commit)
     Solution 2: git pull --rebase (replay your commits)
     Solution 3: git push --force (DANGEROUS! Overwrites remote history;
                 prefer --force-with-lease, which refuses to overwrite
                 commits you have not fetched)
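
The rejection rule is an ancestry check: a push is a fast-forward only if the remote's current tip is an ancestor of the commit you are pushing. The sketch below models that rule on a hypothetical graph matching the scenarios above; `push` is an illustrative function, not how a Git server is implemented.

```python
# Commit -> parents. E descends from C via D; F is a sibling line from C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["C"]}

def is_ancestor(old, new):
    """True if 'old' is reachable from 'new' by following parent edges."""
    stack, seen = [new], set()
    while stack:
        c = stack.pop()
        if c == old:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False

def push(remote_tip, local_tip, force=False):
    """Return the new remote tip, or raise if the update would lose commits."""
    if force or is_ancestor(remote_tip, local_tip):
        return local_tip
    raise RuntimeError("rejected (non-fast-forward)")

print(push("C", "E"))              # remote at C, you push E: fast-forward
try:
    push("E", "F")                 # remote at E, you push F: D and E would be lost
except RuntimeError as err:
    print(err)
print(push("E", "F", force=True))  # forced: remote now at F, D and E orphaned
```

The forced case makes the danger visible: nothing reachable from F contains D or E anymore.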

Tracking Relationships

                    BRANCH TRACKING

     When you run: git checkout -b feature origin/feature

     Git creates:
     1. Local branch "feature" pointing to same commit as origin/feature
     2. Tracking relationship: feature -> origin/feature

     .git/config:
     +------------------------------------------+
     | [branch "feature"]                        |
     |     remote = origin                       |
     |     merge = refs/heads/feature            |
     +------------------------------------------+


     EFFECTS OF TRACKING:

     $ git status
     On branch feature
     Your branch is ahead of 'origin/feature' by 2 commits.

     $ git pull          # Knows to pull from origin/feature
     $ git push          # Knows to push to origin/feature


     SETTING UP TRACKING:

     # When creating a branch:
     git checkout -b feature origin/feature        # Automatic
     git checkout --track origin/feature           # Shorthand

     # For existing branch:
     git branch --set-upstream-to=origin/feature feature
     git branch -u origin/feature                  # Shorthand

     # View tracking:
     git branch -vv
     * main    e7f8a9b [origin/main: ahead 2] Latest commit
       feature a1b2c3d [origin/feature: behind 1] Feature work
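
The "ahead" and "behind" numbers are set differences over reachable commits. The following sketch uses a hypothetical graph arranged to give the same counts as the `git branch -vv` output above; the names `reachable` and `ahead_behind` are illustrative.

```python
# Commit -> parents, plus where each ref currently points (hypothetical).
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "X": ["B"]}
refs = {"main": "D", "origin/main": "B", "feature": "B", "origin/feature": "X"}

def reachable(tip):
    """All commits reachable from a tip by following parent edges."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def ahead_behind(branch, upstream):
    mine, theirs = reachable(refs[branch]), reachable(refs[upstream])
    return len(mine - theirs), len(theirs - mine)

print("main:   ", ahead_behind("main", "origin/main"))
print("feature:", ahead_behind("feature", "origin/feature"))
```

In a real repository, `git rev-list --left-right --count main...origin/main` reports the same pair of numbers.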

The Complete Picture

                    PUTTING IT ALL TOGETHER

     +====================================================================+
     ||                        YOUR LOCAL REPOSITORY                     ||
     +====================================================================+
     |                                                                     |
     |  .git/refs/heads/          .git/refs/remotes/origin/               |
     |  (Local Branches)          (Remote-Tracking Branches)               |
     |                                                                     |
     |       main ----+                origin/main ----+                   |
     |                |                                |                   |
     |                v                                v                   |
     |                                                                     |
     |                         COMMIT GRAPH                                |
     |                                                                     |
     |                              +---+                                  |
     |                              |C6 | (main)                           |
     |                              +---+                                  |
     |                                |                                    |
     |                              +---+                                  |
     |                              |C5 |                                  |
     |                              +---+                                  |
     |                                |                                    |
     |      +---+                   +---+                                  |
     |      |C4 | (origin/main)     |C3 |                                  |
     |      +---+                   +---+                                  |
     |        |                       |                                    |
     |        +-----------+-----------+                                    |
     |                    |                                                |
     |                  +---+                                              |
     |                  |C2 |                                              |
     |                  +---+                                              |
     |                    |                                                |
     |                  +---+                                              |
     |                  |C1 |                                              |
     |                  +---+                                              |
     |                                                                     |
     +=====================================================================+

     Reading this graph:
     - Your local "main" is at C6 (you made commits C3, C5, C6 on top of C2)
     - "origin/main" is at C4 (last time you fetched, remote was at C4)
     - You are 3 commits ahead of origin/main and 1 behind (C4); the two histories diverged at C2
     - If someone pushed C7 to GitHub, you won't see it until you fetch
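To check the counts in a graph like this yourself, one option is `git rev-list --left-right --count`. The sketch below recreates the commits as drawn. It uses empty commits and a hand-written `refs/remotes/origin/main` in place of a real fetch; both are shortcuts assumed for the demo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

commit() { git commit -q --allow-empty -m "$1"; }

commit C1
commit C2

# Pretend the last fetch saw C4 on top of C2.
git checkout -q -b tmp
commit C4
git update-ref refs/remotes/origin/main "$(git rev-parse HEAD)"
git checkout -q main
git branch -q -D tmp

# Local work on main.
commit C3
commit C5
commit C6

# Left count = commits only on main, right count = commits only on origin/main.
counts=$(git rev-list --left-right --count main...origin/main)
ahead=$(echo "$counts" | cut -f1)
behind=$(echo "$counts" | cut -f2)
echo "ahead=$ahead behind=$behind"   # prints: ahead=3 behind=1
```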

Concept Summary Table

Concept Cluster	What You Must Internalize
Navigation	`checkout`/`switch` moves YOU (HEAD). `reset` moves the BRANCH reference.
History Rewriting	`rebase` is copy-pasting commits to a new base. It changes commit hashes.
The Reflog	Your safety net. It records every movement of HEAD, even “lost” commits.
The Stage	You can craft commits precisely using `add -p` (patch mode).
Bisect	Binary search for bugs. The most underused superpower in Git.

Deep Dive Reading by Concept

Concept 1: The Basics & History

Concept 2: Branching & Merging

Concept 3: Tools & Debugging

Concept 4: Undo Operations & Recovery

Concept 5: Collaboration Workflows

Concept 6: Git Hooks & Automation

Concept 7: Security in Git

| Concept | Book & Chapter |
|---------|----------------|
| Signing Commits (GPG) | “Pro Git” by Scott Chacon, Ch. 7.4: “Signing Your Work” |
| Credential Management | “Pro Git” by Scott Chacon, Ch. 7.14: “Credential Storage” |
| Git Internals (understanding the object model) | “Pro Git” by Scott Chacon, Ch. 10: “Git Internals” |
| Offensive Git Techniques | “Black Hat Bash” by Nick Aleks, Ch. 7: “Persistence and Privilege Escalation” (understanding attack vectors) |
| Secrets in Shell Scripts | “Black Hat Bash” by Nick Aleks, Ch. 3: “Reconnaissance” (what attackers look for in repos) |
| Secure Shell Practices | “Effective Shell” by Dave Kerr, Ch. 20: “Security” |
| Protecting Sensitive Data | “The Pragmatic Programmer” by Hunt & Thomas, Topic 26: “Decoupling” (separating secrets from code) |

Essential Reading Order

For maximum comprehension, follow this reading sequence:

Phase 1: Foundation (Before Starting Projects)

“The Linux Command Line” — Part 1: Learning the Shell (essential terminal fluency)
“Pro Git” — Ch. 1-2: Getting Started & Git Basics (core mental model)
“Effective Shell” — Ch. 1-3: Getting Started (efficient command line usage)

Phase 2: Core Git Mastery (During Projects 1-3)

“Pro Git” — Ch. 3: Git Branching (the heart of Git)
“Pro Git” — Ch. 7: Git Tools (reset, reflog, bisect, stash)
“The Pragmatic Programmer” — Topic 19: Version Control (philosophy and best practices)

Phase 3: Collaboration & Workflows (During Projects 4-5)

“Pro Git” — Ch. 5: Distributed Git (team workflows)
“Clean Code” — Ch. 1 & 17: Clean Code principles (what makes commits worth reviewing)
“Pro Git” — Ch. 8: Customizing Git (hooks, aliases, configuration)

Phase 4: Advanced & Security (Post-Projects Deep Dive)

“Pro Git” — Ch. 10: Git Internals (understand the object database)
“Wicked Cool Shell Scripts” — Ch. 1 & 6: Automation patterns for hooks
“Black Hat Bash” — Ch. 3 & 7: Security awareness (what attackers look for in Git repos)
“Effective Shell” — Ch. 20: Security (protecting your Git workflow)

Quick Reference During Any Project:

“The Linux Command Line” — Ch. 20: Text Processing (parsing git log output)
“Wicked Cool Shell Scripts” — Any chapter for automation ideas
“Pro Git” — Ch. 7.7: Reset Demystified (when confused about HEAD/Index/Working Dir)

Project List

These projects are designed to be “Simulation Scenarios.” You will create a repo, create a specific mess or situation, and then resolve it.

Project 1: “The Precision Surgeon” — Master Staging and Committing

Attribute	Value
File	GIT_MASTERY_LEARNING_PROJECTS.md
Main Tool	Git CLI
Difficulty	Beginner
Knowledge Area	Git Workflow
Main Book	“Pro Git” Chapter 2

What you’ll build: A repository where you simulate “messy coding” (changing 3 different features in one file) and then use granular staging to create 3 distinct, clean atomic commits.

Why it teaches Git: Most beginners type git add . and commit everything. This teaches you the Index (Staging Area) as a tool for crafting history, not just a hurdle before committing.

Core challenges you’ll face:

Partial Staging: How to stage only lines 10-15 of a file while ignoring lines 20-30.
Atomic Commits: Ensuring that if you revert commit #2, commit #1 and #3 still work.
Diffing Staged vs Unstaged: Seeing exactly what you are about to commit vs what you are leaving behind.

Key Concepts:

Interactive Staging: git add -p (patch mode)
Diffing: git diff vs git diff --staged
Atomic Commits: The philosophy of “one logical change per commit”

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic git init, add, commit knowledge

Real World Outcome

You will have a simulated project history that looks perfect, even though your working style was chaotic.

Full Interactive Session:

# STEP 1: You made 3 changes in 'app.py' at the same time
$ git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   app.py

no changes added to commit (use "git add" and/or "git commit -a")

# STEP 2: See what you changed
$ git diff app.py
diff --git a/app.py b/app.py
index 8a1d2e3..4f5c6d7 100644
--- a/app.py
+++ b/app.py
@@ -1,5 +1,5 @@
 def calculate_total(items):
-    total = 0
+    total = 0.0  # FIX: Use float for precision
     for item in items:
         total += item.price
     return total
@@ -10,8 +10,8 @@ def format_currency(amount):
     return f"${amount:.2f}"

 def main():
-for i in range(10):
-process(i)
+    for i in range(10):
+        process(i)

+# Add new login feature
+def user_login(username, password):
+    """Authenticate user credentials."""
+    return authenticate(username, password)

# STEP 3: Begin interactive staging
$ git add -p app.py
diff --git a/app.py b/app.py
index 8a1d2e3..4f5c6d7 100644
--- a/app.py
+++ b/app.py
@@ -1,5 +1,5 @@
 def calculate_total(items):
-    total = 0
+    total = 0.0  # FIX: Use float for precision
     for item in items:
         total += item.price
     return total
(1/3) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]? y

# Git moves to the next hunk (the indentation fix)
@@ -10,8 +10,8 @@ def format_currency(amount):
     return f"${amount:.2f}"

 def main():
-for i in range(10):
-process(i)
+    for i in range(10):
+        process(i)
(2/3) Stage this hunk [y,n,q,a,d,K,j,J,g,/,e,?]? n

# Git moves to the next hunk (the new feature)
+# Add new login feature
+def user_login(username, password):
+    """Authenticate user credentials."""
+    return authenticate(username, password)
(3/3) Stage this hunk [y,n,q,a,d,K,g,/,e,?]? n

# STEP 4: Verify what is staged vs unstaged
$ git diff --staged
diff --git a/app.py b/app.py
index 8a1d2e3..9b2c3d4 100644
--- a/app.py
+++ b/app.py
@@ -1,5 +1,5 @@
 def calculate_total(items):
-    total = 0
+    total = 0.0  # FIX: Use float for precision
     for item in items:
         total += item.price
     return total

$ git diff
# Shows the OTHER two changes still waiting (indent fix + new feature)

# STEP 5: Commit ONLY the bugfix
$ git commit -m "Fix: Use float for precision in calculate_total"
[main d4e5f6a] Fix: Use float for precision in calculate_total
 1 file changed, 1 insertion(+), 1 deletion(-)

# STEP 6: Repeat for the next logical change (indentation)
$ git add -p app.py
# Stage ONLY the indentation hunk this time
$ git commit -m "Style: Fix indentation in main loop"
[main e7f8a9b] Style: Fix indentation in main loop
 1 file changed, 2 insertions(+), 2 deletions(-)

# STEP 7: Finally, add the feature
$ git add -p app.py
# Stage the new function
$ git commit -m "Feat: Add user login function"
[main a1b2c3d] Feat: Add user login function
 1 file changed, 4 insertions(+)

# STEP 8: View your clean history
$ git log --oneline
a1b2c3d (HEAD -> main) Feat: Add user login function
e7f8a9b Style: Fix indentation in main loop
d4e5f6a Fix: Use float for precision in calculate_total
b3c4d5e Initial commit

# Instead of one "WIP updates" commit, you have 3 clean, revertible commits.
# Each commit can be reverted independently without affecting the others.

The Core Question You’re Answering

“How do I separate my messy ‘thinking process’ from the clean ‘project history’?”

Before you write any code, sit with this question. Your working directory is your messy desk. The repository history is the polished report you hand to your boss. The “Index” is where you organize the papers before stapling them.

Concepts You Must Understand First

The Staging Area (Index)
- What happens to a file when you git add it? Does it move? Or is it copied?
- Book Reference: “Pro Git” Ch. 2.2
Hunks
- What is a “hunk” in a diff?
- Book Reference: “Pro Git” Ch. 7.2 (Interactive Staging)
The Object Database
- Git stores content in a content-addressable filesystem. Every piece of content is stored as a blob identified by its SHA-1 hash.
- When you stage a file, Git immediately creates a blob object in .git/objects/ containing the file’s content.
- This means staging is NOT just “marking” a file - it’s actually creating a snapshot of the file at that moment.
- Book Reference: “Pro Git” Ch. 10.2 “Git Objects”
How Blobs Are Created When Staging
- When you run git add file.txt, Git computes the SHA-1 of the content (prefixed with a short "blob <size>" header) and writes a compressed blob to .git/objects/.
- The Index (.git/index) maps each filename to its blob SHA-1 and file mode.
- You can verify this:

```bash
# Before staging
$ find .git/objects -type f | wc -l
42

# Stage a file
$ git add myfile.txt

# After staging - one more object!
$ find .git/objects -type f | wc -l
43

# See what's in the index
$ git ls-files --stage
100644 a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0 0	myfile.txt
```
- Book Reference: “Pro Git” Ch. 10.2 “Git Objects” & Ch. 10.3 “Git References”
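The claim that staging writes a real object can be checked in a scratch repository. This sketch assumes a fresh repo with no packed objects; `myfile.txt` and the `count_objects` helper are illustrative names. `git hash-object` without `-w` only computes the ID, so it serves as an independent prediction of what `git add` will store.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: number of loose object files on disk.
count_objects() { find .git/objects -type f | wc -l | tr -d ' '; }

before=$(count_objects)
printf 'hello\n' > myfile.txt
expected=$(git hash-object myfile.txt)   # computes the blob ID, writes nothing
git add myfile.txt                       # writes the blob and updates the index
after=$(count_objects)

# Index line format: <mode> <object> <stage><TAB><path>
staged=$(git ls-files --stage myfile.txt | cut -d' ' -f2)
echo "objects on disk: $before -> $after"
echo "index entry points at: $staged"
git cat-file -t "$staged"                # object type
git cat-file -p "$staged"                # object content
```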
The Three Trees Model
- Git manages three “trees” (snapshots):
  - HEAD: The last commit snapshot (what you committed)
  - Index: The next commit snapshot (what you’re staging)
  - Working Directory: Your sandbox (what you see in your editor)
- git add copies from Working Directory -> Index
- git commit copies from Index -> HEAD (creates new commit)
- git checkout copies from HEAD -> Working Directory (and updates Index)
- Book Reference: “Pro Git” Ch. 7.7 “Reset Demystified” - This chapter brilliantly explains the three trees
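One way to see the three trees side by side is to make each hold a different version of the same file. In this sketch the file name `notes.txt` and its contents are placeholders. `git show HEAD:path` reads from the last commit, and `git show :path` reads from the Index.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "version A" > notes.txt
git add notes.txt
git commit -q -m "A"             # HEAD holds A

echo "version B" > notes.txt
git add notes.txt                # Index now holds B

echo "version C" > notes.txt     # Working Directory moves on to C

in_head=$(git show HEAD:notes.txt)
in_index=$(git show :notes.txt)
in_worktree=$(cat notes.txt)
echo "HEAD=$in_head | Index=$in_index | Working Dir=$in_worktree"
```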

Questions to Guide Your Design

Scenario Setup: Create a single file with 3 distinct “logic blocks” (e.g., a math function, a string formatter, and a main loop).
The Mess: Edit ALL three blocks at once. Introduce a bug fix in one, a variable rename in another, and a new comment in the third.
The Constraint: You are NOT allowed to use git add . or git add filename. You must use git add -p.

Thinking Exercise

Trace the Staging Area:

Working Dir: Version C (Messy)
Index:       Version A (Clean)
Repo:        Version A (Clean)

When you stage the first hunk:

Working Dir: Version C (Messy)
Index:       Version B (Partial Change)
Repo:        Version A (Clean)

Draw this mental model. The Index is a snapshot builder.

The Interview Questions They’ll Ask

“What is the difference between git diff and git diff --cached (or --staged)?”

Expected Answer:
- git diff shows changes between Working Directory and Index (unstaged changes)
- git diff --staged (or --cached) shows changes between Index and HEAD (what will be committed)
- To see ALL changes: git diff HEAD

“I accidentally added a file to staging. How do I unstage it without losing my changes?”

Expected Answer:

# Modern Git (2.23+)
$ git restore --staged filename.txt

# Classic Git
$ git reset HEAD filename.txt

# Both leave your working directory changes intact
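A quick way to convince yourself that unstaging is safe is to compare the status and the file content before and after. The sketch below uses the classic `git reset` form so it also runs on Git versions older than 2.23; the file name and contents are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > filename.txt
git add filename.txt
git commit -q -m "Initial commit"

echo "v2" > filename.txt
git add filename.txt
git status -s                        # "M  filename.txt": staged

git reset -q HEAD -- filename.txt    # unstage only; the file itself is untouched
status=$(git status -s)
content=$(cat filename.txt)
echo "$status"                       # " M filename.txt": unstaged
echo "$content"                      # v2
```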

“Why is git add . generally considered bad practice in large teams?”

Expected Answer:
- It stages EVERYTHING, including unintended files (debug logs, IDE settings, secrets)
- It makes atomic commits harder - you might commit unrelated changes together
- It bypasses the opportunity to review what you’re about to commit
- Better practice: git add -p for precision, or at minimum git add -u (only tracked files)
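The difference between staging everything and staging only tracked files can be shown with one tracked file and one stray file. In this sketch `app.py` and `secrets.txt` are hypothetical names, and no ignore rules are assumed to be in effect.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "print('hi')" > app.py
git add app.py
git commit -q -m "Initial commit"

echo "print('hello')" > app.py          # tracked file, modified
echo "API_KEY=example" > secrets.txt    # untracked file that must stay out

git add -u                              # stages changes to tracked files only

staged=$(git diff --staged --name-only)
untracked=$(git ls-files --others --exclude-standard)
echo "staged: $staged"
echo "still untracked: $untracked"
```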
“What does git add -p do and what are the key options?”

Expected Answer: Interactive staging that lets you stage individual hunks. Key options:
- y - stage this hunk
- n - do not stage this hunk
- s - split the hunk into smaller hunks
- e - manually edit the hunk
- q - quit (don’t stage this hunk or any remaining)
- a - stage this hunk and all remaining hunks in the file
- d - do not stage this hunk or any remaining hunks in the file

“How can you see exactly what is in the staging area (Index)?”

Expected Answer:

# See staged file names
$ git diff --staged --name-only

# See staged content
$ git diff --staged

# See the raw index with blob SHAs
$ git ls-files --stage

# See a summary of staged vs unstaged
$ git status -s
# M  = staged modification
#  M = unstaged modification
# MM = both staged and unstaged modifications

“What happens internally when you run git add?”

Expected Answer:
1. Git computes the SHA-1 hash of the file content (prefixed with a short "blob <size>" header)
2. Git compresses the content and stores it as a blob in .git/objects/
3. Git updates the Index (.git/index) with the filename, blob SHA, and file mode
4. The file is now “staged” - the blob exists and the index points to it
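The same steps can be carried out by hand with plumbing commands. This is a sketch of the idea, not a claim about how `git add` is implemented internally; `file.txt` is a placeholder name.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'hello\n' > file.txt

# Steps 1-2: hash the content and write it as a blob into .git/objects/.
blob=$(git hash-object -w file.txt)

# Step 3: record mode, blob ID, and path in the index.
git update-index --add --cacheinfo 100644,"$blob",file.txt

# Step 4: the file now shows up as staged.
index_blob=$(git ls-files --stage file.txt | cut -d' ' -f2)
status=$(git status -s)
echo "$status"      # "A  file.txt"
```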

“How do you stage only part of a file when the changes are too close together for git add -p to split them?”

Expected Answer: Use the e (edit) option in git add -p to manually edit the hunk:

$ git add -p
# When prompted, press 'e' to edit
# In the editor, delete the '+' lines you DON'T want to stage
# For a '-' line you don't want to stage, replace the '-' with a space (it becomes context)
# Save and exit

“What is the relationship between the Index and a commit?”

Expected Answer: The Index IS essentially the next commit’s tree, pre-assembled. When you run git commit:
1. Git creates a tree object from the current Index
2. Git creates a commit object pointing to that tree
3. Git moves the current branch to the new commit (HEAD follows, because it points at that branch)
4. The Index remains unchanged (still matching the new commit)
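These steps map onto three plumbing commands. The sketch below builds a root commit by hand in a scratch repository. It illustrates the sequence under the assumption that the branch is named `main`; the file name and commit message are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "hello" > file.txt
git add file.txt

tree=$(git write-tree)                                       # 1. tree object from the Index
commit=$(echo "Initial commit" | git commit-tree "$tree")    # 2. commit object pointing at that tree
git update-ref refs/heads/main "$commit"                     # 3. move the branch HEAD is attached to

git log --oneline
status=$(git status -s)    # 4. empty: the Index still matches the new commit
```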

Hints in Layers

Hint 1: Setup Create a Python or JS file. Commit a base version. Then open it and make 3 unrelated changes in different parts of the file.

Hint 2: The Command Run git add -p. Git will ask Stage this hunk [y,n,q,a,d,/,s,e]?.

Hint 3: Splitting Hunks If Git groups two changes into one “hunk” but you want to separate them, use the s (split) option.

Hint 4: Review Use git diff --staged to verify ONLY the bug fix is ready to commit.

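Hint 5: Common Gotcha - Changes Too Close Together Git merges changes whose context lines overlap into ONE hunk (with the default 3 lines of context, that means changes only a few lines apart). The s option can split such a hunk only where at least one unchanged line separates the changes. Solutions: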

# Option 1: Try to split (may not work if changes are too close)
$ git add -p
(1/1) Stage this hunk [y,n,q,a,d,s,e,?]? s
# "Sorry, cannot split this hunk"

# Option 2: Manually edit the hunk
$ git add -p
(1/1) Stage this hunk [y,n,q,a,d,s,e,?]? e
# In the editor, delete unwanted '+' lines and turn unwanted '-' lines into context (leading space)

Hint 6: Common Gotcha - Accidentally Staged the Wrong Hunk If you staged something by mistake, you can unstage specific hunks:

# Unstage interactively (the opposite of git add -p)
$ git reset -p
# This lets you "unstage" hunks one by one

# Or unstage the entire file and start over
$ git restore --staged filename.txt
$ git add -p filename.txt

Books That Will Help

Topic	Book	Chapter
Recording Changes	“Pro Git”	Ch. 2.2
Interactive Staging	“Pro Git”	Ch. 7.2
Git Objects (Blobs, Trees)	“Pro Git”	Ch. 10.2
Reset Demystified (Three Trees)	“Pro Git”	Ch. 7.7

Project 2: Master the Reflog and Recovery

Attribute	Value
File	GIT_MASTERY_LEARNING_PROJECTS.md
Main Tool	Git CLI
Difficulty	Intermediate
Knowledge Area	Git Safety Nets
Main Book	“Pro Git” Chapter 7

What you’ll build: A repository where you intentionally “lose” work (delete a branch, hard reset a commit) and then use reflog and detached HEAD states to recover it completely.

Why it teaches Git: Fear holds users back. Once you know that Git records every commit you make (even the abandoned ones) in the reflog, you become fearless. You will also master “Detached HEAD” mode, understanding it’s just a state of “exploring without a branch label.”

Core challenges you’ll face:

Recovering a deleted branch: You deleted feature-x but realized you needed it.
Undoing a hard reset: You ran git reset --hard HEAD~1 and lost code. How to get it back?
Navigating Detached HEAD: Checking out a specific commit hash to look around without creating a branch.

Key Concepts:

The Reflog: The local journal of where HEAD has been.
HEAD vs Branch Pointers: Understanding the difference.
Object ID (SHA-1): Finding commits by hash.

Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: Project 1

Real World Outcome

You will perform a “magic trick”: delete a branch with unique code, prove it’s gone, and then resurrect it instantly.

Full Interactive Session - Branch Recovery:

# STEP 1: Setup - Create a feature branch with valuable work
$ git checkout -b valuable-feature
Switched to a new branch 'valuable-feature'

$ echo "Critical algorithm implementation" > algorithm.py
$ git add algorithm.py
$ git commit -m "Add critical algorithm implementation"
[valuable-feature a1b2c3d] Add critical algorithm implementation
 1 file changed, 1 insertion(+)
 create mode 100644 algorithm.py

$ echo "Performance optimization" >> algorithm.py
$ git commit -am "Optimize algorithm performance"
[valuable-feature b2c3d4e] Optimize algorithm performance
 1 file changed, 1 insertion(+)

# STEP 2: Switch away and DELETE the branch
$ git checkout main
Switched to branch 'main'

$ git branch -D valuable-feature
Deleted branch valuable-feature (was b2c3d4e).

# STEP 3: OH NO! I needed that! Verify it's gone from git log
$ git log --oneline --all
8f7e6d5 (HEAD -> main) Initial commit
# The valuable-feature commits are NOT shown!

$ git branch -a
* main
# No valuable-feature branch!

$ ls algorithm.py
ls: algorithm.py: No such file or directory
# The file is GONE from the working directory!

# STEP 4: Check the reflog - our safety net
$ git reflog
8f7e6d5 (HEAD -> main) HEAD@{0}: checkout: moving from valuable-feature to main
b2c3d4e HEAD@{1}: commit: Optimize algorithm performance
a1b2c3d HEAD@{2}: commit: Add critical algorithm implementation
8f7e6d5 (HEAD -> main) HEAD@{3}: checkout: moving from main to valuable-feature
8f7e6d5 (HEAD -> main) HEAD@{4}: commit (initial): Initial commit

# With timestamps (even more useful!)
$ git reflog --date=relative
8f7e6d5 (HEAD -> main) HEAD@{5 minutes ago}: checkout: moving from valuable-feature to main
b2c3d4e HEAD@{6 minutes ago}: commit: Optimize algorithm performance
a1b2c3d HEAD@{7 minutes ago}: commit: Add critical algorithm implementation
8f7e6d5 (HEAD -> main) HEAD@{8 minutes ago}: checkout: moving from main to valuable-feature
8f7e6d5 (HEAD -> main) HEAD@{10 minutes ago}: commit (initial): Initial commit

# STEP 5: RESURRECTION! Create branch pointing to the "lost" commit
$ git branch valuable-feature b2c3d4e
# Or, equivalently, use reflog syntax instead of the hash:
# $ git branch valuable-feature HEAD@{1}

$ git checkout valuable-feature
Switched to branch 'valuable-feature'

$ ls algorithm.py
algorithm.py
# The file is BACK!

$ cat algorithm.py
Critical algorithm implementation
Performance optimization
# All our work is recovered!

$ git log --oneline
b2c3d4e (HEAD -> valuable-feature) Optimize algorithm performance
a1b2c3d Add critical algorithm implementation
8f7e6d5 (main) Initial commit
# Full history restored!

Full Interactive Session - Hard Reset Recovery:

# STEP 1: Accidentally reset and "lose" commits
$ git log --oneline
c3d4e5f (HEAD -> main) Feature C - very important
b2c3d4e Feature B - also important
a1b2c3d Feature A
9f8e7d6 Initial commit

# Oops! I meant to reset to Feature B, but I typed the wrong commit!
$ git reset --hard 9f8e7d6
HEAD is now at 9f8e7d6 Initial commit

# All three features are GONE from git log
$ git log --oneline
9f8e7d6 (HEAD -> main) Initial commit

# STEP 2: Don't panic! Check reflog
$ git reflog
9f8e7d6 (HEAD -> main) HEAD@{0}: reset: moving to 9f8e7d6
c3d4e5f HEAD@{1}: commit: Feature C - very important
b2c3d4e HEAD@{2}: commit: Feature B - also important
a1b2c3d HEAD@{3}: commit: Feature A
9f8e7d6 (HEAD -> main) HEAD@{4}: commit (initial): Initial commit

# STEP 3: Recovery - reset back to where we were
$ git reset --hard c3d4e5f
HEAD is now at c3d4e5f Feature C - very important

$ git log --oneline
c3d4e5f (HEAD -> main) Feature C - very important
b2c3d4e Feature B - also important
a1b2c3d Feature A
9f8e7d6 Initial commit
# Everything is back!

Full Interactive Session - Detached HEAD Recovery:

# STEP 1: Enter detached HEAD state (exploring an old commit)
$ git checkout a1b2c3d
Note: switching to 'a1b2c3d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

HEAD is now at a1b2c3d Feature A

# STEP 2: Make some commits while detached (oh no!)
$ echo "Experimental fix" > fix.py
$ git add fix.py
$ git commit -m "Experimental fix while exploring"
[detached HEAD d4e5f6a] Experimental fix while exploring
 1 file changed, 1 insertion(+)

$ echo "Another fix" >> fix.py
$ git commit -am "Another experimental fix"
[detached HEAD e5f6a7b] Another experimental fix
 1 file changed, 1 insertion(+)

# STEP 3: Switch back to main (this "loses" the detached commits!)
$ git checkout main
Warning: you are leaving 2 commits behind, not connected to
any of your branches:

  e5f6a7b Another experimental fix
  d4e5f6a Experimental fix while exploring

If you want to keep them by creating a new branch, this may be a good time
to do so with:

 git branch <new-branch-name> e5f6a7b

Switched to branch 'main'

# STEP 4: Oh no, I wanted those commits! Check reflog
$ git reflog
c3d4e5f (HEAD -> main) HEAD@{0}: checkout: moving from e5f6a7b to main
e5f6a7b HEAD@{1}: commit: Another experimental fix
d4e5f6a HEAD@{2}: commit: Experimental fix while exploring
a1b2c3d HEAD@{3}: checkout: moving from main to a1b2c3d

# STEP 5: Save those commits to a branch!
$ git branch experimental-fixes e5f6a7b
$ git log experimental-fixes --oneline
e5f6a7b (experimental-fixes) Another experimental fix
d4e5f6a Experimental fix while exploring
a1b2c3d Feature A
9f8e7d6 Initial commit
# Commits are now safely on a branch!

The Core Question You’re Answering

“If I delete a branch, where does the code go?”

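It doesn’t go anywhere (immediately). The pointer (the sticky note) is destroyed, but the commit (the snapshot) remains in the database until it is garbage collected. By default that cannot happen while a reflog entry still references the commit, and reflog entries for unreachable commits are kept for 30 days.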

Concepts You Must Understand First

Garbage Collection
- When does Git actually delete data? (git gc)
- Book Reference: “Pro Git” Ch. 10.7
Detached HEAD
- What does it mean to be “detached”? (HEAD points to Commit, not Branch Name).
- Book Reference: “Pro Git” Ch. 7.1 “Revision Selection”
Object Reachability
- Git only keeps objects that are “reachable” - meaning there’s a path from a reference (branch, tag, HEAD, reflog) to the object.
- When you delete a branch, the commits become “unreachable” from any branch, but they’re still reachable from the reflog.
- Once reflog entries expire AND gc runs, truly unreachable objects are deleted.
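- You can visualize reachability:

```bash
# Show all commits reachable from refs (branches, tags, remote-tracking branches)
$ git rev-list --all

# Show commits reachable from reflog but not from branches
# (--no-reflogs tells fsck to stop counting reflog entries as references)
$ git fsck --unreachable --no-reflogs

# Show dangling objects (unreachable from any ref including reflog,
# and not referenced by any other object)
$ git fsck --dangling
```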
- Book Reference: “Pro Git” Ch. 10.7 “Maintenance and Data Recovery”
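The role the reflog plays in reachability can be observed directly. This sketch deletes a branch in a scratch repository and asks `git fsck` about the old tip twice: once with reflogs counted (the default) and once with `--no-reflogs`. The branch and file names are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b feature-x
echo "unique work" > work.txt
git add work.txt
git commit -q -m "Unique work"
tip=$(git rev-parse HEAD)

git checkout -q main
git branch -q -D feature-x        # no branch points at $tip any more

# Count fsck lines that mention the old tip, under both views.
with_reflog=$(git fsck --unreachable 2>/dev/null | grep -c "$tip" || true)
without_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c "$tip" || true)

echo "listed as unreachable (reflog counted): $with_reflog"
echo "listed as unreachable (reflog ignored): $without_reflog"
echo "object type still on disk: $(git cat-file -t "$tip")"
```

While the HEAD reflog still mentions the commit, the default view does not report it, which is why `git gc` will not prune it yet.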
When Garbage Collection Actually Runs
- Git does NOT immediately delete unreachable objects. Here’s the timeline:
  - Reflog expiry: By default, reflog entries older than 90 days (for reachable commits) or 30 days (for unreachable commits) are pruned.
  - gc auto: Git runs gc --auto after certain operations; it only does real work once a threshold is crossed (e.g., more than about 6700 loose objects by default).
  - Manual gc: You can force it with git gc --prune=now (DANGEROUS - deletes unreachable objects immediately).
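- Check your gc configuration:

```bash
# Both commands print nothing if the setting was never changed;
# the built-in defaults (30 and 90 days) then apply.
$ git config --get gc.reflogExpireUnreachable
$ git config --get gc.reflogExpire

# gc.log is typically left behind only when a background "gc --auto" reported a problem
$ stat .git/gc.log 2>/dev/null || echo "No gc.log"
```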
- Book Reference: “Pro Git” Ch. 10.7 “Maintenance and Data Recovery”
The Reflog is LOCAL Only
- The reflog is stored in .git/logs/ and is NOT pushed to remotes.
- Each clone has its own reflog tracking its own HEAD movements.
- This means you cannot recover someone else’s deleted branch using your reflog - only they can.
- Book Reference: “Pro Git” Ch. 10.3 “Git References”
HEAD vs Branch Pointers vs Tags
- HEAD: Points to “where you are now” - either a branch name (attached) or a commit SHA (detached).
- Branch: A mutable pointer to a commit. Moves when you commit.
- Tag: A pointer to a commit that is meant to stay fixed. It does not move when you commit; changing it requires deleting or force-replacing the tag.
- Check what HEAD points to:

```bash
# See what HEAD points to
$ cat .git/HEAD
ref: refs/heads/main    # Attached to branch 'main'

# Or in detached state:
$ cat .git/HEAD
a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0    # Direct SHA
```
- Book Reference: “Pro Git” Ch. 10.3 “Git References”
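In a script, a common way to tell the two states apart is `git symbolic-ref -q HEAD`, which succeeds only while HEAD is attached to a branch. The sketch below uses empty commits in a scratch repository to show both states.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

attached_ref=$(git symbolic-ref -q HEAD)     # refs/heads/main
echo "attached: $attached_ref"

git checkout -q --detach HEAD~1              # HEAD now stores a commit ID directly

if git symbolic-ref -q HEAD >/dev/null; then
  state=attached
else
  state=detached
fi
echo "state: $state"
echo "raw .git/HEAD: $(cat .git/HEAD)"
```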

Questions to Guide Your Design

Scenario: Create a repo with a linear history. Create a branch experiment. Commit something cool. Switch to main. Delete experiment.
The Panic: Simulate the realization that you deleted the wrong branch.
The Fix: Use git reflog to find the SHA-1 of the commit at the tip of experiment before deletion.

Thinking Exercise

Diagramming Reflog:

HEAD@{0}: checkout: moving from A to B
HEAD@{1}: commit: added file X
HEAD@{2}: reset: moving to HEAD~1

If you are at HEAD@{0}, how do you get back to the state at HEAD@{1}?

The Interview Questions They’ll Ask

“I did git reset --hard and lost my work. Can I get it back?”

Expected Answer:
- Yes, if you committed it - use git reflog to find the commit SHA and git reset --hard <sha> to recover.
- Not via the reflog, if the work was only staged (in the index) or unstaged (working directory only): the reflog records commits that HEAD pointed to, not uncommitted changes.
- Partial recovery of staged changes might still be possible with git fsck --lost-found, which can find dangling blobs.

“What is the reflog?”

Expected Answer: The reflog is a local journal that records every time HEAD moves - commits, checkouts, resets, merges, rebases, etc. It’s stored in .git/logs/HEAD and acts as a safety net for recovering “lost” commits. Key points:
- Local only (not shared with remotes)
- By default, entries expire after 90 days (reachable commits) or 30 days (unreachable commits)
- Can be viewed with git reflog or git log -g

“What is a ‘detached HEAD’ state and how do I save changes I made while in it?”

Expected Answer: Detached HEAD means HEAD points directly to a commit SHA instead of a branch name. Any commits you make are not on any branch and will be “lost” when you checkout a branch. To save work made in detached HEAD:
```
# Before switching away, create a branch:
$ git branch my-saved-work

# Or if you already switched and need to recover:
$ git reflog
$ git branch my-saved-work <sha-from-reflog>
```
“What’s the difference between git reset --soft, --mixed, and --hard?”

Expected Answer:

| Mode | HEAD | Index (Staging) | Working Directory |
|------|------|-----------------|-------------------|
| --soft | Moves | Unchanged | Unchanged |
| --mixed (default) | Moves | Reset to match HEAD | Unchanged |
| --hard | Moves | Reset to match HEAD | Reset to match HEAD |

- --soft: Uncommit but keep changes staged
- --mixed: Uncommit and unstage, but keep changes in working directory
- --hard: Uncommit and discard ALL changes (DESTRUCTIVE - but recoverable via reflog if committed)
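The three modes can be compared on the same commit by looking at `git status -s` after each reset. This sketch uses a one-file scratch repository and restores the original state between experiments; `f.txt` and its contents are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "one" > f.txt
git add f.txt
git commit -q -m "first"
echo "two" > f.txt
git commit -q -am "second"
second=$(git rev-parse HEAD)

git reset -q --soft HEAD~1
soft=$(git status -s)            # "M  f.txt": change is still staged
git reset -q --hard "$second"    # restore for the next experiment

git reset -q --mixed HEAD~1
mixed=$(git status -s)           # " M f.txt": change is only in the working directory
git reset -q --hard "$second"

git reset -q --hard HEAD~1
hard=$(git status -s)            # empty: change is gone from Index and working directory
hard_content=$(cat f.txt)        # one

printf 'soft:  [%s]\nmixed: [%s]\nhard:  [%s]\n' "$soft" "$mixed" "$hard"
```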
“How long do I have to recover a deleted branch or reset commit?”

Expected Answer: By default, you have 30 days for unreachable commits (no branch points to them) before reflog entries expire. After that, git gc will permanently delete the objects. You can check and modify these settings:
```
$ git config gc.reflogExpireUnreachable  # Default: 30 days
$ git config gc.reflogExpire             # Default: 90 days
```

“What is git fsck and when would you use it?”

Expected Answer: git fsck (file system check) verifies the integrity of the Git database and finds unreachable objects. Use cases:

# Find dangling commits (not on any branch, not in reflog)
$ git fsck --dangling

# Find unreachable objects
$ git fsck --unreachable

# Recover lost staged changes (find dangling blobs)
$ git fsck --lost-found
# Creates .git/lost-found/other/ with recovered blobs

“How do you prevent accidentally losing work with git reset --hard?”

Expected Answer: Several strategies:
1. Always commit before resetting (even a WIP commit)
2. Use git stash instead of reset when you want to temporarily set aside changes
3. Use git reset --soft or --mixed instead of --hard when possible
4. Keep advice.detachedHead enabled (it is on by default) so Git warns you when you enter detached HEAD, where new commits belong to no branch
5. Create a backup branch before dangerous operations: git branch backup-before-reset
“Can you recover uncommitted changes that were lost?”

Expected Answer:
- Staged but not committed: Possibly. The blob exists in .git/objects/. Use git fsck --lost-found to recover dangling blobs.
- Never staged: No. Git never tracked it. Only external backup or filesystem recovery tools could help.
- Best practice: Commit early and often, even with “WIP” messages. You can always squash later.
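The "staged but not committed" case can be rehearsed safely in a scratch repository. The sketch stages a new file, wipes it with a hard reset, and then looks for the orphaned blob. It assumes the repository holds exactly one dangling blob, which is true here but not in a long-lived repo; `draft.txt` is a placeholder name.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

echo "hours of uncommitted work" > draft.txt
git add draft.txt          # the blob is written to .git/objects/ right now
git reset -q --hard        # Index and working tree reset; draft.txt disappears

# fsck reports the orphaned blob and copies it into .git/lost-found/other/
blob=$(git fsck --lost-found 2>/dev/null | awk '$1 == "dangling" && $2 == "blob" {print $3}')
recovered=$(git cat-file -p "$blob")
echo "recovered: $recovered"
```

In a real repository there may be many dangling blobs, so expect to inspect several with `git cat-file -p` before finding the right one.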

Hints in Layers

Hint 1: Creating the “Loss” Commit something. Then run git reset --hard HEAD~1. Verify the file is GONE.

Hint 2: Viewing the Log git log won’t show the lost commit because it’s no longer part of the history chain reachable from HEAD. You need git reflog.

Hint 3: Recovery Once you find the hash (e.g., abc1234), you can git checkout abc1234 or git reset --hard abc1234.
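A compact rehearsal of the loss and the recovery, using reflog syntax instead of a copied hash. Right after the reset, `HEAD@{0}` is the reset itself and `HEAD@{1}` is where HEAD was just before it. The file name is a placeholder.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "Initial commit"
echo "important" > x.txt
git add x.txt
git commit -q -m "added file X"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1       # the "loss": x.txt is gone, git log no longer shows the commit
ls x.txt 2>/dev/null || echo "x.txt is gone"

git reset -q --hard "HEAD@{1}"   # the recovery: jump back to the previous HEAD position
git log --oneline
```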

Hint 4: Saving Detached HEAD If you make commits in detached HEAD, they have no branch pointing to them. To save them, create a branch before you switch away: git branch new-branch-name.

Hint 5: Common Gotcha - Reflog Only Shows HEAD Movements Plain git reflog shows where HEAD has been in this clone; it is not a list of all commits. If you’re looking for a commit that HEAD never pointed to (e.g., a commit on a remote that was force-pushed over), you won’t find it there.

# Each ref has its own reflog!
$ git reflog show main           # Reflog for main branch
$ git reflog show origin/main    # Reflog for remote-tracking branch
$ git reflog show HEAD           # Default - where HEAD has been

# If you can't find it in reflog, try:
$ git fsck --unreachable         # Find ALL unreachable objects
$ git fsck --dangling            # Find dangling (truly orphaned) objects

Hint 6: Common Gotcha - Fresh Clones Have No Reflog History When you clone a repository, your reflog starts with a single entry: the clone itself. You cannot recover commits that were “lost” before you cloned.

# After fresh clone:
$ git reflog
a1b2c3d (HEAD -> main, origin/main) HEAD@{0}: clone: from https://github.com/user/repo
# Only one entry!

# This is why shared recovery must happen on the machine where the commit was made.

Books That Will Help

Topic	Book	Chapter
Reflog	“Pro Git”	Ch. 7.1 “Revision Selection”
Reset Demystified	“Pro Git”	Ch. 7.7
Data Recovery	“Pro Git”	Ch. 10.7 “Maintenance and Data Recovery”
Git References	“Pro Git”	Ch. 10.3 “Git References”
Git Objects	“Pro Git”	Ch. 10.2 “Git Objects”

Project 3: “The History Surgeon” — Master Interactive Rebase

Attribute	Value
File	GIT_MASTERY_LEARNING_PROJECTS.md
Main Tool	Git CLI (Vim/Nano editor)
Difficulty	Advanced
Knowledge Area	History Rewriting
Main Book	“Pro Git” Chapter 3.6

What you’ll build: A simulated “bad history” containing typos, secrets committed by accident, and “WIP” commits. You will use git rebase -i to squash, edit, reorder, and drop commits to produce a pristine history.

Why it teaches Git: This is the difference between a junior and a senior dev. A junior pushes the mess. A senior curates the history. Interactive rebase gives you god-mode over your commit timeline.

Core challenges you’ll face:

Squashing: Combining “Fix typo” and “Update docs” into the original “Docs” commit.
Editing: Going back 5 commits to fix a typo in the code WITHOUT adding a new “fix” commit on top.
Splitting: Taking one massive commit and splitting it into two smaller ones.
Dropping: Completely removing a commit that added a simulated API key.

Key Concepts:

Rebase vs Merge: Replaying changes on top of a base.
Rewriting History: Changing SHA-1 hashes (and why you shouldn’t do this on shared branches).
The “Todo” List: The script Git generates for rebase.

Difficulty: Advanced Time estimate: 1 week Prerequisites: Projects 1 & 2

Real World Outcome

You will take a history that looks like this:

A - B - C(WIP) - D(Fix C) - E(Add Secret) - F(Remove Secret)

And transform it into:

A - B - C'(Finished Feature)

(Commits E and F are gone; C and D are combined.)

Complete Terminal Session:

# Step 1: View the messy history
$ git log --oneline
8d7c6a2 (HEAD -> feature) Remove secret key
1a2b3c4 Oops added secret key
9e8d7c6 Fix typo in function
5f4e3d2 WIP implementing function
a1b2c3d Initial commit

# Step 2: Start interactive rebase (go back 4 commits)
$ git rebase -i HEAD~4

What You See in the Editor (Vim/Nano):

When you run git rebase -i HEAD~4, Git opens your default editor with a “todo” file:

pick 5f4e3d2 WIP implementing function
pick 9e8d7c6 Fix typo in function
pick 1a2b3c4 Oops added secret key
pick 8d7c6a2 Remove secret key

# Rebase a1b2c3d..8d7c6a2 onto a1b2c3d (4 commands)
#

# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
#                    commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
#

# These lines can be re-ordered; they are executed from top to bottom.
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.

How to Edit in Vim: i to insert, Esc then :wq to save/quit, :q! to abort.

How to Edit in Nano: Edit directly, Ctrl+O to save, Ctrl+X to exit.

Your Edited Todo File (The Surgery Plan):

pick 5f4e3d2 WIP implementing function
fixup 9e8d7c6 Fix typo in function
drop 1a2b3c4 Oops added secret key
drop 8d7c6a2 Remove secret key

After Saving:

Successfully rebased and updated refs/heads/feature.

$ git log --oneline
3g4h5j6 (HEAD -> feature) WIP implementing function
a1b2c3d Initial commit

IMPORTANT: The commit hash changed from 5f4e3d2 to 3g4h5j6. Git created a NEW commit. The originals still exist in reflog for ~30 days.
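
The same surgery can be scripted, which makes it easy to repeat while practicing. This is a sketch under stated assumptions: it builds a throwaway repository that mirrors the history above, and it uses the GIT_SEQUENCE_EDITOR environment variable to edit the todo file with sed instead of opening Vim or Nano. The helper name commit_file, the file names, and the fake key are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .

# Hypothetical helper: commit_file <file> <content> <message>
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit_file README.md   "demo"                   "Initial commit"
commit_file feature.txt "functoin draft"         "WIP implementing function"
commit_file feature.txt "function done"          "Fix typo in function"
commit_file secret.txt  "API_KEY=not-a-real-key" "Oops added secret key"
git rm -q secret.txt
git commit -q -m "Remove secret key"

# The surgery plan: todo line 2 becomes fixup, lines 3-4 become drop.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/' -e '3,4s/^pick/drop/'" \
    git rebase -i HEAD~4 >/dev/null 2>&1

git log --oneline
commit_count="$(git rev-list --count HEAD)"
secret_hits="$(git log -p | grep -c 'API_KEY' || true)"
echo "commits: $commit_count, mentions of API_KEY in history: $secret_hits"
```

Editing the todo by line number only works because this history is known in advance; on a real branch, open the editor and read the list first.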

The Core Question You’re Answering

“How do I fix a mistake I made 5 commits ago without adding a ‘fix’ commit on top?”

Concepts You Must Understand First

The Rebase Todo File Format

The todo file is a script with one command per line: <command> <hash> <message>

Command	Short	What It Does
`pick`	`p`	Use the commit as-is
`reword`	`r`	Use commit but edit the message
`edit`	`e`	Pause rebase to let you amend this commit
`squash`	`s`	Meld into previous commit, keep both messages
`fixup`	`f`	Meld into previous commit, discard this message
`drop`	`d`	Remove this commit entirely
`exec`	`x`	Run a shell command (e.g., run tests)
`break`	`b`	Pause here for manual inspection

Book Reference: “Pro Git” Ch. 7.6 “Rewriting History”

Why Commit Hashes Change

A commit’s SHA-1 hash is computed from: tree, parent hash, author info, committer info, and message.

If ANY of these change, the hash changes. During rebase:
- The parent pointer changes (new base)
- The committer timestamp changes (now)
- Therefore: ALL rebased commits get new hashes
```
BEFORE rebase onto main:
A -- B -- C (main)
      \
       D -- E -- F (feature)

AFTER rebase onto main:
A -- B -- C (main)
          \
           D' -- E' -- F' (feature)
```
D’, E’, F’ have the same content as D, E, F but different hashes.
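
To see this in isolation, the sketch below rewrites a commit while changing only the committer timestamp. It assumes a throwaway repository; the timestamps are arbitrary values in Git's raw date format, and the identity is made up.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "hello" > file.txt
git add file.txt

# Raw Git date format: "<unix seconds> <utc offset>". Values are arbitrary.
GIT_AUTHOR_DATE="1735689600 +0000" GIT_COMMITTER_DATE="1735689600 +0000" \
    git commit -q -m "Add file"
hash_before="$(git rev-parse HEAD)"
tree_before="$(git rev-parse 'HEAD^{tree}')"

# Re-create the commit, changing ONLY the committer timestamp.
GIT_COMMITTER_DATE="1735776000 +0000" git commit -q --amend --no-edit
hash_after="$(git rev-parse HEAD)"
tree_after="$(git rev-parse 'HEAD^{tree}')"

echo "tree:   $tree_before -> $tree_after"
echo "commit: $hash_before -> $hash_after"
git cat-file -p HEAD   # shows the tree, author, committer, and message fields
```

The tree hash stays the same because the content did not change; the commit hash differs because one of its inputs did.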

Book Reference: “Pro Git” Ch. 10.2 “Git Objects”

The “Copy-Paste” Model of Rebase

Rebase does NOT move commits. It copies them:
1. Git saves your commits as patches
2. Git resets your branch to the new base
3. Git re-applies each patch one by one (creating new commits)
4. The old commits become orphaned (still exist in reflog)
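
The four steps above can be approximated by hand with git cherry-pick. This is an illustration of the model, not a description of how git rebase is implemented internally; the branch names and files are assumptions for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "A" > a.txt; git add a.txt; git commit -q -m "A"
git branch -M main
git checkout -q -b feature
echo "D" > d.txt; git add d.txt; git commit -q -m "D"
echo "E" > e.txt; git add e.txt; git commit -q -m "E"
git checkout -q main
echo "C" > c.txt; git add c.txt; git commit -q -m "C"

old_tip="$(git rev-parse feature)"

# 1. Remember the commits that exist only on feature, oldest first.
to_replay="$(git rev-list --reverse main..feature)"

# 2. Point feature at the new base.
git checkout -q feature
git reset -q --hard main

# 3. Re-apply each saved commit; every cherry-pick creates a NEW commit.
for commit in $to_replay; do
    git cherry-pick "$commit" >/dev/null
done
new_tip="$(git rev-parse feature)"

# 4. The old tip is no longer on the branch, but the object still exists.
echo "old tip: $old_tip ($(git cat-file -t "$old_tip"))"
echo "new tip: $new_tip"
git log --oneline --graph feature
```
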
Book Reference: “Pro Git” Ch. 3.6 “Rebasing”

Immutability of Commits
- You can’t change a commit. You can only create a new commit with changes. This changes the SHA-1.
- Book Reference: “Pro Git” Ch. 10.2

The Rebase Script Commands
- pick, reword, edit, squash, fixup, drop. What do they mean?
- Book Reference: “Pro Git” Ch. 7.6

Questions to Guide Your Design

Scenario: Commit a file config.txt with PASSWORD=password123.
The Mistake: Continue working, adding 3 more commits.
The Goal: Remove the password from history entirely.
Note: Merely editing the file in a new commit isn’t enough (it’s still in history!). You must drop or edit the old commit.

Thinking Exercise

Exercise 1: The Pancake Stack Model

Imagine you are picking up a stack of 4 pancakes (commits). You set them aside. You fix the bottom one. Then you place the other 3 back on top. Since the bottom one changed shape, the ones on top sit differently (their hashes change).

Exercise 2: Trace Commit Identity Through Rebase

Given:

main:    A -- B -- C
feature:      \-- D -- E -- F

After git rebase main from feature:

D’, E’, F’ ALL get new hashes because their parents changed
The CONTENT is the same, but the IDENTITY is different

Exercise 3: Predict Conflict Points

If commit C changed line 10 and commit D also changed line 10:

D’: CONFLICT at line 10
E’: Depends on E’s changes
F’: Depends on resolution of D’

Exercise 4: The Squash Sequence

Given:

pick abc1234 Add user model
squash def5678 Fix typo
pick jkl3456 Add controller
fixup mno7890 Fix bug

Result: 2 commits (squash combines first two, fixup combines last two)

The Interview Questions They’ll Ask

“When should you squash commits?”
- “WIP” commits followed by “finish feature”
- Typo fixes that belong to the original commit
- Before merging a feature branch

“Why is it dangerous to rebase a branch that has already been pushed to a shared repository?”
- Rebase rewrites commit hashes
- Others’ local commits point to OLD hashes
- After force-push, their history diverges
- Golden Rule: Never rebase shared branches

“What is the difference between squash and fixup?”
- squash: Combines commits AND messages (opens editor)
- fixup: Combines commits, discards the fixup’s message

“When should you NOT use rebase?”
- Never on public/shared branches
- Never on already-pushed commits (unless solo)
- Avoid rebasing merge commits

“What happens if you need to push after rebasing an already-pushed branch?”
```
$ git push --force-with-lease origin feature-branch
```
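Use --force-with-lease (safer than --force): it only overwrites the remote branch if that branch still points where your remote-tracking ref (e.g. origin/feature-branch) says it does, i.e. nobody else has pushed since you last fetched.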
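
The lease check can be watched in action with a bare repository and two clones. This is a sketch under assumptions: the directory names (server.git, alice, bob) and commit messages are invented, and everything lives in a temp directory.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Hypothetical layout: one bare "server" plus two clones.
root="$(mktemp -d)"
cd "$root"
git init -q --bare server.git
git -C server.git symbolic-ref HEAD refs/heads/main

# Alice pushes the first commit.
git clone -q server.git alice 2>/dev/null
cd "$root/alice"
echo "v1" > app.txt
git add app.txt
git commit -q -m "First commit"
git branch -M main
git push -q -u origin main

# Bob clones now, so his origin/main points at "First commit".
cd "$root"
git clone -q server.git bob

# Alice pushes again; Bob does not fetch.
cd "$root/alice"
echo "v2" > app.txt
git commit -q -am "Second commit"
git push -q origin main

# Bob rewrites history locally, then tries to overwrite the remote branch.
cd "$root/bob"
git commit -q --amend -m "First commit (reworded)"
if git push -q --force-with-lease origin main 2>/dev/null; then
    lease_result="overwritten"
else
    lease_result="rejected"
fi
remote_tip="$(git -C "$root/server.git" log -1 --format=%s main)"
echo "push: $lease_result, remote tip is still: $remote_tip"
```

A plain --force in Bob's position would have silently discarded Alice's second commit. Note that running git fetch first updates origin/main, after which the lease would pass, so fetch and inspect before forcing.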

“How do you recover from a bad rebase?”

$ git reflog  # Find pre-rebase state
$ git reset --hard HEAD@{5}  # Reset to that point
# Or immediately after:
$ git reset --hard ORIG_HEAD

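“What is git rebase --onto?” It replays a range of commits onto a different base. The command below takes the commits that are on my-branch but not on feature and replays them on top of main: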
```
$ git rebase --onto main feature my-branch
```
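
A small end-to-end run makes the three arguments concrete. The sketch assumes a throwaway repository where my-branch was started from feature; the file names and commit messages are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "base" > base.txt; git add base.txt; git commit -q -m "A (main)"
git branch -M main

git checkout -q -b feature
echo "feature" > feature.txt; git add feature.txt; git commit -q -m "B (feature)"

git checkout -q -b my-branch
echo "mine" > mine.txt; git add mine.txt; git commit -q -m "D (my-branch)"
echo "more" >> mine.txt; git commit -q -am "E (my-branch)"

# Replay only the commits in feature..my-branch (D and E) on top of main.
git rebase -q --onto main feature my-branch

git log --oneline --graph --all
commits_on_top="$(git rev-list --count main..my-branch)"
echo "commits on my-branch above main: $commits_on_top"
```

After the rebase, my-branch no longer contains commit B, so feature.txt is absent from its working tree.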
“Explain git rebase main vs git merge main.”
- Merge: Creates merge commit, preserves hashes, safe for shared
- Rebase: Replays commits, creates NEW hashes, linear history

Hints in Layers

Hint 1: The Command git rebase -i HEAD~n where n is how far back you want to go.

Hint 2: Removing a file from history Mark the commit as edit. When rebase pauses, git rm sensitive_file. Then git commit --amend. Then git rebase --continue.
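
Here is a scripted sketch of that edit flow, assuming a throwaway repository that follows the config.txt scenario from “Questions to Guide Your Design”. GIT_SEQUENCE_EDITOR stands in for the interactive editor; settings.txt is a hypothetical second file, included so the amended commit is not left empty (Git refuses to amend a commit into an empty one unless you pass --allow-empty).

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "app" > app.txt; git add app.txt; git commit -q -m "Initial commit"

# The mistake: a password is committed alongside a legitimate file.
echo "PASSWORD=password123" > config.txt
echo "timeout=30" > settings.txt
git add config.txt settings.txt
git commit -q -m "Add settings"

# Three more commits on top.
for n in 1 2 3; do
    echo "feature $n" >> app.txt
    git commit -q -am "Add feature $n"
done

# Mark the first todo line (the commit with the password) as "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" \
    git rebase -i HEAD~4 >/dev/null 2>&1

# The rebase is paused on that commit: remove the file and amend.
git rm -q config.txt
git commit -q --amend --no-edit
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

password_hits="$(git log -p | grep -c 'password123' || true)"
echo "mentions of the password in history: $password_hits"
```
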

Hint 3: Squashing Use squash (or s) on the second commit you want to merge into the first. The first one must stay pick.

Hint 4: The Escape Hatch

$ git rebase --abort  # Cancel and return to pre-rebase state

Hint 5: The Autosquash Feature

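# Create commits with special prefixes:
$ git commit -m "fixup! Add user model"
# Or let Git write that message for you (abc1234 = the commit being fixed):
$ git commit --fixup=abc1234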

# Then run with --autosquash:
$ git rebase -i --autosquash HEAD~5

# Enable by default:
$ git config --global rebase.autosquash true

Hint 6: Recovering from Bad Rebase

$ git reflog  # Find "rebase (start)" entry
$ git reset --hard ORIG_HEAD  # Or specific hash

Books That Will Help

Topic	Book	Chapter
Rewriting History	“Pro Git” by Scott Chacon	Ch. 7.6
Rebasing Basics	“Pro Git” by Scott Chacon	Ch. 3.6
Git Objects & Hashes	“Pro Git” by Scott Chacon	Ch. 10.2
Reset Demystified	“Pro Git” by Scott Chacon	Ch. 7.7

Project 4: “The Detective” — Master Bisect and Blame

Attribute	Value
File	GIT_MASTERY_LEARNING_PROJECTS.md
Main Tool	Git CLI + Automated Test Script
Difficulty	Intermediate
Knowledge Area	Debugging
Main Book	“Pro Git” Chapter 7.10

What you’ll build: A repository with 100 commits (generated by a script). One specific commit introduced a subtle bug (e.g., changed + to -). You will use git bisect to automate the search and find the culprit in O(log n) steps.

Why it teaches Git: git bisect is a superpower. It turns “I don’t know when this broke” into “Commit a1b2c3d broke it.”

Core challenges you’ll face:

Defining Good vs Bad: Identifying a known good state and a known bad state.
Automation: Writing a tiny script (test.sh) that returns exit code 0 (good) or 1 (bad) so Git can run the search automatically.
Blame: Once you find the commit, using git blame to see who and why.

Key Concepts:

Binary Search: How bisect works.
Exit Codes: How to tell Git if a commit is good or bad.

Difficulty: Intermediate Time estimate: Weekend Prerequisites: Basic shell scripting

Real World Outcome

You will find a needle in a haystack in seconds.

Complete Automated Bisect Session:

# STEP 1: Create a test repository with 100 commits (setup script)
$ mkdir bisect-demo && cd bisect-demo && git init
$ for i in {1..100}; do
    if [ $i -eq 67 ]; then
        # Introduce the bug at commit 67
        echo "result = a - b  # BUG: should be +" > calculator.py
    else
        echo "# Commit $i" >> history.txt
    fi
    git add -A && git commit -m "Commit $i"
done

# STEP 2: Create a test script that detects the bug
$ cat > test.sh << 'EOF'
#!/bin/bash
# Exit 0 = good (no bug), Exit 1 = bad (bug exists)
if grep -q "a - b" calculator.py 2>/dev/null; then
    exit 1  # BAD - bug exists
else
    exit 0  # GOOD - no bug
fi
EOF
$ chmod +x test.sh

# STEP 3: Start bisect
$ git bisect start
Status: waiting for both good and bad commits

# STEP 4: Mark current HEAD as bad (we know the bug exists now)
$ git bisect bad HEAD
Status: waiting for good commit(s), bad commit known

# STEP 5: Mark an early commit as good (we know it worked before)
$ git bisect good HEAD~99
Bisecting: 49 revisions left to test after this (roughly 6 steps)
[abc1234567890] Commit 50

# STEP 6: Run automated bisect with our test script
$ git bisect run ./test.sh
running './test.sh'
Bisecting: 24 revisions left to test after this (roughly 5 steps)
[def2345678901] Commit 75
running './test.sh'
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[ghi3456789012] Commit 62
running './test.sh'
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[jkl4567890123] Commit 69
running './test.sh'
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[mno5678901234] Commit 66
running './test.sh'
Bisecting: 0 revisions left to test after this (roughly 1 step)
[pqr6789012345] Commit 67
running './test.sh'
pqr6789012345 is the first bad commit
commit pqr6789012345
Author: Developer <dev@example.com>
Date:   Fri Dec 26 10:30:00 2025 +0000

    Commit 67

 calculator.py | 1 +
 1 file changed, 1 insertion(+)

bisect found first bad commit

# STEP 7: Examine the culprit commit
$ git show pqr6789012345
commit pqr6789012345
Author: Developer <dev@example.com>
Date:   Fri Dec 26 10:30:00 2025 +0000

    Commit 67

diff --git a/calculator.py b/calculator.py
new file mode 100644
index 0000000..a1b2c3d
--- /dev/null
+++ b/calculator.py
@@ -0,0 +1 @@
+result = a - b  # BUG: should be +

# STEP 8: Use git blame to see line-by-line authorship
$ git blame calculator.py
pqr67890 (Developer 2025-12-26 10:30:00 +0000 1) result = a - b  # BUG: should be +

# STEP 9: End bisect and return to original branch
$ git bisect reset
Previous HEAD position was pqr6789... Commit 67
Switched to branch 'main'

Key Observations:

100 commits searched in only 6 steps (log2(100) ≈ 6.6)
Manual search would average 50 steps
Completely automated - no human intervention during search

The Core Question You’re Answering

“How do I find a bug when I have no idea where in the code it is, only that ‘it used to work’?”

Concepts You Must Understand First

Binary Search Algorithm
- Why checking the middle cuts the work in half
- Time complexity: O(log n) vs O(n) for linear search
- For 1000 commits: ~10 steps vs ~500 average for linear

Shell Exit Codes
- 0 means success (test passed = commit is GOOD)
- 1-127, except 125, means failure (test failed = commit is BAD)
- git bisect run relies entirely on these exit codes

Special Exit Codes for Bisect
- 125 = Skip this commit (cannot test, e.g., won’t compile)
- Anything else (128 and above) = Abort bisect (fatal error)

Book Reference: “Pro Git” Ch. 7.10 “Debugging with Git”
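
The exit-code rules can be written down as a small lookup. The function below is a sketch of how git bisect run interprets a script's exit status, based on the ranges listed above; the function name classify_bisect_exit is hypothetical.

```shell
# classify_bisect_exit <code>
# Prints how `git bisect run` treats a test script that exits with <code>.
classify_bisect_exit() {
    exit_code=$1
    if [ "$exit_code" -eq 0 ]; then
        echo "good"
    elif [ "$exit_code" -eq 125 ]; then
        echo "skip"
    elif [ "$exit_code" -ge 1 ] && [ "$exit_code" -le 127 ]; then
        echo "bad"
    else
        echo "abort"
    fi
}

for code in 0 1 2 125 127 128 255; do
    echo "exit $code -> $(classify_bisect_exit "$code")"
done
```

One practical consequence: a test script that dies with “command not found” (exit 127) marks the commit as bad rather than aborting, so check that the script itself runs before trusting the result.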

Questions to Guide Your Design

Setup: Write a script to generate 100 commits. Each commit appends a number to a file.
The Bug: At commit #67, introduce a change that breaks the pattern (or simply echo "error" > status.log).
The Test: Write a check script: grep "error" status.log && exit 1 || exit 0.

Thinking Exercise

Exercise 1: Calculate the Steps

If you have 1000 commits, how many steps does bisect take to find the bug?

Answer: log2(1000) ≈ 10 steps. Manual searching would take avg 500 steps.

Exercise 2: Manual Binary Search on Paper

Given commits numbered 1-16, where commit 11 introduced the bug:

Step 0: Range [1, 16], Test commit 8
        Result: GOOD (bug not present)
        New range: [9, 16]

Step 1: Range [9, 16], Test commit 12
        Result: BAD (bug present)
        New range: [9, 12]

Step 2: Range [9, 12], Test commit 10
        Result: GOOD
        New range: [11, 12]

Step 3: Range [11, 12], Test commit 11
        Result: BAD
        New range: [11, 11]

Found: Commit 11 is the first bad commit (4 steps for 16 commits)
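
The paper trace above can be reproduced with a few lines of shell. This is a simplified model on a linear history, using the same midpoint rule as the trace; real git bisect chooses midpoints from the commit graph, so its step counts can differ slightly. The function name simulate_bisect is hypothetical.

```shell
# simulate_bisect <total> <first_bad>
# Commits are numbered 1..total; every commit >= first_bad is "bad".
# Prints a trace to stderr and "<found> <steps>" to stdout.
simulate_bisect() {
    low=1
    high=$1
    first_bad=$2
    steps=0
    while [ "$low" -lt "$high" ]; do
        mid=$(( (low + high) / 2 ))
        if [ "$mid" -ge "$first_bad" ]; then
            echo "Step $steps: range [$low, $high], test $mid -> BAD" >&2
            high=$mid            # first bad commit is mid or earlier
        else
            echo "Step $steps: range [$low, $high], test $mid -> GOOD" >&2
            low=$(( mid + 1 ))   # first bad commit is after mid
        fi
        steps=$(( steps + 1 ))
    done
    echo "$low $steps"
}

simulate_bisect 16 11   # prints: 11 4 (commit 11, found in 4 steps)
```

Changing the arguments lets you check your own paper traces for other ranges.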

Exercise: Do this yourself for commits 1-32 where commit 23 is bad. How many steps?

Exercise 3: What Happens with Skipped Commits?

If commit 8 won’t compile (can’t test), you mark it with git bisect skip:

Range [1, 16], Test commit 8
Result: SKIP (cannot test)
Git picks nearby commit 7 or 9 to test instead

This may add extra steps but bisect still works.

The Interview Questions They’ll Ask

“How do you find which commit introduced a regression?”

Expected Answer: Use git bisect with binary search:

$ git bisect start
$ git bisect bad HEAD      # Current is broken
$ git bisect good v1.0     # This version worked
# Then manually test each commit, or:
$ git bisect run ./test.sh # Automate with test script

“What does git bisect run do?”

Expected Answer: It automates the bisect process by running a script at each step. The script must exit with 0 for “good” commits and with a code from 1 to 127 (except 125, which means “skip”) for “bad” commits; any other code aborts the bisect. Git uses binary search to find the first bad commit.

“What command shows line-by-line authorship of a file?”

Expected Answer: git blame <filename> shows who last modified each line:

$ git blame calculator.py
abc1234 (Alice 2025-01-15 10:30 +0000 1) def add(a, b):
def5678 (Bob   2025-01-16 14:22 +0000 2)     return a - b  # Bug!

“How do you handle commits that can’t be tested during bisect?”

Expected Answer: Use git bisect skip for commits that won’t compile or can’t be tested:
```
$ git bisect skip
# Or skip a range:
$ git bisect skip v2.0..v2.1
```
Git will try nearby commits instead. In automated mode, the test script should return exit code 125 to skip.

“How would you combine bisect with blame to debug an issue?”

Expected Answer:
1. Use git bisect to find the commit that introduced the bug
2. Use git show <commit> to see what changed
3. Use git blame <file> to see who wrote each line
4. Use git log -p <file> to see the history of changes to that file
This gives you: WHEN the bug was introduced, WHAT changed, and WHO made the change.

“How do you debug flaky tests with bisect?”

Expected Answer: Flaky tests are tricky because the same commit might pass or fail randomly. Strategies:

#!/bin/bash
# Run the test multiple times in your test script:
for i in {1..5}; do
    if ! ./run_test.sh; then
        exit 1  # BAD - at least one failure
    fi
done
exit 0  # GOOD - all 5 passed

Or use git bisect skip when results are inconsistent.

“What’s the difference between git blame and git log -p?”

Expected Answer:
- git blame <file>: Shows current state with last modifier per line
- git log -p <file>: Shows chronological history of all changes to file
- git log -S "search_term": Finds commits that added/removed this string
- git log --follow <file>: Follows file through renames

“Can you bisect to find when a bug was FIXED?”

Expected Answer: Yes! Use git bisect with terms reversed:

$ git bisect start --term-old=broken --term-new=fixed
$ git bisect broken HEAD~100  # Old version was broken
$ git bisect fixed HEAD       # Current version is fixed
# Finds the first commit where the fix appeared

Hints in Layers

Hint 1: Generation Script Use a loop in Bash/Python to commit changes:

for i in {1..100}; do
    echo $i >> file.txt
    # commit -a skips untracked files, so stage the file explicitly
    git add file.txt && git commit -m "Commit $i"
done

Hint 2: Bisect Manual Mode Start by manually running the test. git bisect good or git bisect bad. Watch Git jump around.

Hint 3: Bisect Auto Mode Once you trust the manual process, use git bisect run <command>.

Hint 4: The Skip Exit Code In your test script, return 125 to skip untestable commits:

#!/bin/bash
if ! make 2>/dev/null; then
    exit 125  # Skip - won't compile
fi
./run_tests.sh

Hint 5: Visualize the Bisect Range

$ git bisect visualize  # Opens gitk showing remaining range
$ git bisect log        # Shows bisect steps taken so far
$ git bisect replay <logfile>  # Replay a saved bisect session

Hint 6: Advanced Blame Options

$ git blame -L 10,20 file.py   # Only lines 10-20
$ git blame -C file.py         # Detect code moved from other files
$ git blame -w file.py         # Ignore whitespace changes
$ git blame --since="2 weeks ago" file.py  # Limit time range

Books That Will Help

Topic	Book	Chapter
Debugging with Git	“Pro Git” by Scott Chacon	Ch. 7.10 “Debugging with Git”
Binary Search Algorithm	“Introduction to Algorithms” by CLRS	Ch. 2.3
Git Bisect Internals	“Pro Git” by Scott Chacon	Ch. 10.1 “Plumbing and Porcelain”
Revision Selection	“Pro Git” by Scott Chacon	Ch. 7.1 “Revision Selection”

Additional Resources:

Resource	Description
`git help bisect`	Official documentation with all options
`git log -S "text"`	“Pickaxe” search - finds commits adding/removing text
`git log --bisect`	Show commits in bisect range

Project 5: “The Merge Conflict Dojo” — Master Merge and Remote Collaboration

Attribute	Value
File	GIT_MASTERY_LEARNING_PROJECTS.md
Main Tool	Git CLI + Diff Tool (optional)
Difficulty	Advanced
Knowledge Area	Collaboration
Main Book	“Pro Git” Chapter 3

What you’ll build: A simulated multi-user environment (using two local folders acting as different “computers”). You will create conflicting changes, push/pull, and resolve:

Content Conflicts: Same line changed differently.
Structural Conflicts: File deleted by one, modified by another.
Diverged History: Handling git pull --rebase vs git merge.

Why it teaches Git: Conflicts are where people quit Git. You will seek them out. You will understand the <<<<<<< HEAD markers and how the “3-way merge” algorithm thinks.

Core challenges you’ll face:

Simulating Remotes: Cloning a local folder (git clone ../remote-repo).
The “Diverged” State: Your local main is at C, origin main is at D. You can’t push. What do you do?
Resolving deletions: User A renamed file.txt to data.txt. User B edited file.txt. Git is smart enough to merge this… mostly.
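
The rename-versus-edit case is easy to reproduce. The sketch below assumes a throwaway repository with two branches standing in for the two users (user-a and user-b are made-up names); it relies on Git's rename detection, which works here because the renamed file is still mostly identical to the original.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > file.txt
git add file.txt
git commit -q -m "Add file.txt"
git branch -M main

# User A renames the file.
git checkout -q -b user-a
git mv file.txt data.txt
git commit -q -m "Rename file.txt to data.txt"

# User B, starting from the same base, edits the file under its old name.
git checkout -q -b user-b main
printf 'line 1\nline 2\nline 3 (edited by B)\nline 4\nline 5\n' > file.txt
git commit -q -am "Edit line 3"

# Merge B's edit into A's branch: Git follows the rename.
git checkout -q user-a
git merge -q --no-edit user-b >/dev/null
cat data.txt
```

The “mostly” matters: if the edit rewrites most of the file, Git may stop recognizing it as a rename and report a modify/delete conflict instead.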

Key Concepts:

Remotes: origin/main is just a pointer to “where main was last time I checked.”
3-Way Merge: Ancestor, Yours, Theirs.
Fast-Forward: When no merge commit is needed.

Difficulty: Advanced Time estimate: 1 week Prerequisites: Project 1 & 2

Real World Outcome

You will confidently fix a state where git push is rejected. This section walks through the COMPLETE setup and resolution process.

Step 1: Create the Bare Repository (The “Server”)

# Create a bare repository to act as your remote server
$ mkdir -p ~/git-dojo && cd ~/git-dojo
$ git init --bare server.git
Initialized empty Git repository in /Users/you/git-dojo/server.git/

# A bare repo has no working directory - it's just the .git contents
$ ls server.git/
HEAD  config  description  hooks/  info/  objects/  refs/

Step 2: Clone as Two Different “Users”

# User A clones the repository
$ git clone server.git userA
Cloning into 'userA'...
warning: You appear to have cloned an empty repository.
done.

# User B clones the repository
$ git clone server.git userB
Cloning into 'userB'...
warning: You appear to have cloned an empty repository.
done.

# Now you have:
# ~/git-dojo/
#   ├── server.git/ (bare repo - the "remote")
#   ├── userA/ (Alice's working copy)
#   └── userB/ (Bob's working copy)

Step 3: User A Creates the Initial Commit

$ cd ~/git-dojo/userA
$ cat > calculator.py << 'EOF'
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b
EOF

$ git add calculator.py
$ git commit -m "Initial calculator implementation"
[master (root-commit) a1b2c3d] Initial calculator implementation
 1 file changed, 8 insertions(+)
 create mode 100644 calculator.py

$ git push -u origin master
Counting objects: 3, done.
Writing objects: 100% (3/3), 267 bytes | 267.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To /Users/you/git-dojo/server.git
 * [new branch]      master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.

Step 4: User B Pulls and Both Users Edit the SAME Line

# User B pulls the initial code
$ cd ~/git-dojo/userB
$ git pull
remote: Enumerating objects: 3, done.
...
From /Users/you/git-dojo/server.git
 * branch            master     -> FETCH_HEAD

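# User A modifies line 2 (the add function)
# (sed -i '' is the macOS/BSD form; with GNU sed on Linux, drop the '' and use sed -i 's/.../.../')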
$ cd ~/git-dojo/userA
$ sed -i '' 's/return a + b/return a + b  # Addition/' calculator.py
$ git commit -am "Add comment to add function"
$ git push

# User B ALSO modifies line 2 (before pulling A's change!)
$ cd ~/git-dojo/userB
$ sed -i '' 's/return a + b/return int(a) + int(b)  # Type safety/' calculator.py
$ git commit -am "Add type casting to add function"

Step 5: The Conflict Moment

$ cd ~/git-dojo/userB
$ git push
To /Users/you/git-dojo/server.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to '/Users/you/git-dojo/server.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. Integrate the remote changes (e.g., 'git pull ...')
hint: before pushing again.

# THE PANIC MOMENT. But not for you.

Step 6: Visualize the Divergence

$ git fetch
$ git log --all --graph --oneline --decorate
* d4e5f6g (origin/master) Add comment to add function
| * 7h8i9j0 (HEAD -> master) Add type casting to add function
|/
* a1b2c3d Initial calculator implementation

Step 7: Resolution Option A - Merge

$ git merge origin/master
Auto-merging calculator.py
CONFLICT (content): Merge conflict in calculator.py
Automatic merge failed; fix conflicts and then commit the result.

# View the EXACT conflict markers
$ cat calculator.py
def add(a, b):
<<<<<<< HEAD
    return int(a) + int(b)  # Type safety
=======
    return a + b  # Addition
>>>>>>> origin/master

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

Understanding the Conflict Markers:

<<<<<<< HEAD
    [YOUR changes - what you committed locally]
=======
    [THEIR changes - what's on the remote]
>>>>>>> origin/master

Resolve by editing the file to keep BOTH improvements:

$ cat > calculator.py << 'EOF'
def add(a, b):
    return int(a) + int(b)  # Type safety + Addition

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b
EOF

$ git add calculator.py
$ git commit -m "Merge: Combine type safety with comment"

# Graph AFTER merge resolution:
$ git log --all --graph --oneline --decorate
*   k1l2m3n (HEAD -> master) Merge: Combine type safety with comment
|\
| * d4e5f6g (origin/master) Add comment to add function
* | 7h8i9j0 Add type casting to add function
|/
* a1b2c3d Initial calculator implementation

$ git push
# Success!

Step 8: Resolution Option B - Rebase (Alternative)

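If you prefer a LINEAR history instead of merge commits, use this in place of Step 7. If you already ran the merge locally but have not pushed it, reset first; once the merge is pushed, origin/master already contains your commit and there is nothing left to rebase.

# Undo the local, unpushed merge to try rebase instead
$ git reset --hard 7h8i9j0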
$ git rebase origin/master
Auto-merging calculator.py
CONFLICT (content): Merge conflict in calculator.py
error: could not apply 7h8i9j0... Add type casting to add function
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add <pathspec>" and run "git rebase --continue".

# Same conflict markers appear - resolve the same way
$ cat > calculator.py << 'EOF'
def add(a, b):
    return int(a) + int(b)  # Type safety + Addition

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b
EOF

$ git add calculator.py
$ git rebase --continue

# Graph AFTER rebase (LINEAR history - no merge commit):
$ git log --all --graph --oneline --decorate
* n4o5p6q (HEAD -> master) Add type casting to add function
* d4e5f6g (origin/master) Add comment to add function
* a1b2c3d Initial calculator implementation

$ git push
# Success!

Key Difference:

Merge preserves history divergence (diamond shape in graph)
Rebase creates linear history (straight line) by replaying your commits

The Core Question You’re Answering

“Why can’t I push? And what are these <<<<<<< symbols in my code?”

Concepts You Must Understand First

The 3-Way Merge Algorithm (CRITICAL)

Git does NOT compare “yours” vs “theirs” directly. It uses a 3-way merge involving:
```
ANCESTOR (Merge Base)
     |
     v
+---------+
| File v1 |  <-- The common ancestor commit
+---------+
   /    \
  /      \
 v        v
+----------+    +----------+
| File v2  |    | File v3  |
| (OURS)   |    | (THEIRS) |
+----------+    +----------+
```
How Git decides:
- If OURS == ANCESTOR and THEIRS changed --> Take THEIRS (they made the change)
- If THEIRS == ANCESTOR and OURS changed --> Take OURS (we made the change)
- If OURS == THEIRS (both changed identically) --> Take either (same result)
- If OURS != THEIRS and both != ANCESTOR --> CONFLICT (Git cannot decide)

Book Reference: “Pro Git” Ch. 3.2 “Basic Branching and Merging”

Merge Strategies: Recursive vs Octopus

Git uses different algorithms depending on the merge scenario:

Strategy	When Used	Description
recursive (ort since Git 2.34)	Default for 2-branch merge	Handles multiple common ancestors by creating a virtual merge base recursively; ort is a faster rewrite that replaced recursive as the default in Git 2.34
octopus	Merging 3+ branches at once	Used for integrating many topic branches; refuses to do complex merges
ours	Explicitly requested	Keeps our version entirely, discards theirs (useful for “closing” branches)
subtree	Merging subprojects	Maps trees between repositories

# Force a specific strategy
$ git merge -s recursive feature-branch
$ git merge -s octopus branch1 branch2 branch3

Book Reference: “Pro Git” Ch. 7.8 “Advanced Merging”

Why Conflicts Happen at the Technical Level

A conflict occurs when Git’s merge algorithm encounters ambiguous intent:
```
Base (line 5):     result = calculate(x)

User A (line 5):   result = calculate(x + 1)    # Added +1
User B (line 5):   result = safe_calculate(x)   # Changed function name

Git's dilemma: Should the result be:
  - result = safe_calculate(x + 1) ?  (combine both)
  - result = calculate(x + 1) ?       (keep A's)
  - result = safe_calculate(x) ?      (keep B's)

Git CANNOT make this decision. It marks a CONFLICT.
```
Types of conflicts:
- Content conflict: Same lines changed differently
- Rename/rename conflict: Both renamed the same file differently
- Modify/delete conflict: One edited, one deleted
- Add/add conflict: Both created a file with same name, different content
Book Reference: “Pro Git” Ch. 7.8 “Advanced Merging - Merge Conflicts”
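
The calculate example can be fed straight to Git's file-level merge tool, git merge-file, which merges three plain files without needing a repository. This is a sketch: the file names and labels are invented for the demo.

```shell
set -eu
cd "$(mktemp -d)"

# The three versions of "line 5" from the example above.
printf 'result = calculate(x)\n'      > base.txt
printf 'result = calculate(x + 1)\n'  > user_a.txt
printf 'result = safe_calculate(x)\n' > user_b.txt

# -p prints the merge result instead of overwriting user_a.txt.
# The exit status is the number of conflicts, so capture it explicitly.
conflicts=0
merged="$(git merge-file -p -L "User A" -L "Base" -L "User B" \
    user_a.txt base.txt user_b.txt)" || conflicts=$?

echo "$merged"
echo "conflicts: $conflicts"
```

The output contains the same <<<<<<<, =======, and >>>>>>> markers you see during a real merge, labeled with the names passed via -L.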

The Merge State Files: MERGE_HEAD, ORIG_HEAD, and More

During a merge conflict, Git creates special reference files:

$ ls .git/
MERGE_HEAD    # SHA of the commit being merged INTO your branch
MERGE_MODE    # Indicates a merge is in progress
MERGE_MSG     # Pre-populated commit message for the merge
ORIG_HEAD     # Where HEAD was BEFORE the merge started (safety backup)

Using these files:

# See what commit is being merged in
$ cat .git/MERGE_HEAD
d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9

# Abort and return to ORIG_HEAD
$ git merge --abort
# With MERGE_HEAD present, this is equivalent to: git reset --merge

# See the prepared merge message
$ cat .git/MERGE_MSG
Merge branch 'feature-x' into master

# Conflicts:
#       calculator.py

Book Reference: “Pro Git” Ch. 10.3 “Git References”
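
You can watch MERGE_HEAD appear and disappear in a scratch repository. The sketch assumes a deliberately conflicting pair of branches; the branch name feature-x and the file f.txt are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "base" > f.txt; git add f.txt; git commit -q -m "Base"
git branch -M main
git checkout -q -b feature-x
echo "theirs" > f.txt; git commit -q -am "Change on feature-x"
git checkout -q main
echo "ours" > f.txt; git commit -q -am "Change on main"

# This merge is expected to stop with a conflict, so ignore its exit status.
git merge feature-x >/dev/null 2>&1 || true

if [ -f .git/MERGE_HEAD ]; then during="present"; else during="absent"; fi
merge_head="$(cat .git/MERGE_HEAD)"
feature_tip="$(git rev-parse feature-x)"
echo "during merge: MERGE_HEAD is $during and points at $merge_head"

git merge --abort

if [ -f .git/MERGE_HEAD ]; then after="present"; else after="absent"; fi
echo "after abort:  MERGE_HEAD is $after, f.txt contains: $(cat f.txt)"
```
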

Tracking Branches
- What is the link between local main and origin/main?
- origin/main is a remote-tracking branch - a local pointer that remembers where main was on origin the last time you fetched/pulled.
- Book Reference: “Pro Git” Ch. 3.5 “Remote Branches”

Fetch vs Pull
- pull = fetch + merge (or rebase with --rebase).
- fetch is SAFE - it only updates remote-tracking branches, never your working directory.
- Always know what your tools do.
- Book Reference: “Pro Git” Ch. 2.5 “Working with Remotes”

Fast-Forward vs True Merge

FAST-FORWARD (no divergence):

A---B---C (main)
         \
          D---E (feature)

After merge: A---B---C---D---E (main, feature)
No merge commit needed - just move the pointer.

TRUE MERGE (diverged histories):

A---B---C---F (main)
         \
          D---E (feature)

After merge: A---B---C---F---G (main)
                      \     /
                       D---E (feature)
G is a MERGE COMMIT with TWO parents.

Book Reference: “Pro Git” Ch. 3.2 “Basic Branching and Merging”
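
One way to tell the two outcomes apart is to count the parents of the resulting commit. The sketch below assumes a throwaway repository and made-up file names; it performs one fast-forward and one true merge.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Count the "parent" header lines of the commit at HEAD.
count_parents() {
    git cat-file -p HEAD | grep -c '^parent ' || true
}

cd "$(mktemp -d)"
git init -q .
echo "B" > b.txt; git add b.txt; git commit -q -m "B"
git branch -M main
echo "C" > c.txt; git add c.txt; git commit -q -m "C"

# Fast-forward: main has not moved since feature branched off.
git checkout -q -b feature
echo "D" > d.txt; git add d.txt; git commit -q -m "D"
git checkout -q main
git merge -q --ff-only feature
ff_parents="$(count_parents)"

# True merge: both branches gain a commit before merging.
git checkout -q feature
echo "E" > e.txt; git add e.txt; git commit -q -m "E"
git checkout -q main
echo "F" > f.txt; git add f.txt; git commit -q -m "F"
git merge -q --no-edit feature
merge_parents="$(count_parents)"

echo "after fast-forward: HEAD has $ff_parents parent(s)"
echo "after true merge:   HEAD has $merge_parents parent(s)"
```
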

Questions to Guide Your Design

Setup: Create a bare repo server.git. Clone it to userA and userB folders.
Conflict 1: User A changes line 1. Commit. Push. User B changes line 1. Commit. Push (fails). Pull (conflict).
Conflict 2: User A deletes file X. User B modifies file X.

Thinking Exercise

Exercise 1: The 3-Way Merge Decision Tree

Scenario A - Conflict:

Ancestor: A = 1
Yours:    A = 2
Theirs:   A = 3

Git cannot decide. CONFLICT.

Scenario B - Auto-resolve (theirs wins):

Ancestor: A = 1
Yours:    A = 1   (no change)
Theirs:   A = 3   (changed)

Git assumes 3 is correct (change wins over no-change).

Scenario C - Auto-resolve (ours wins):

Ancestor: A = 1
Yours:    A = 2   (changed)
Theirs:   A = 1   (no change)

Git takes 2 (our change wins).

Scenario D - Auto-resolve (both same):

Ancestor: A = 1
Yours:    A = 2
Theirs:   A = 2   (both made same change)

Git takes 2 (no conflict - both agree).
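
The four scenarios reduce to a small decision function. This is a toy model for a single value, not Git's implementation (Git applies the same rule to whole regions of a file); the function name merge3 is made up for the exercise.

```shell
# merge3 BASE OURS THEIRS: print the merged value, or CONFLICT.
merge3() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        echo "$ours"            # both sides agree (Scenario D, or nobody changed)
    elif [ "$ours" = "$base" ]; then
        echo "$theirs"          # only theirs changed (Scenario B)
    elif [ "$theirs" = "$base" ]; then
        echo "$ours"            # only ours changed (Scenario C)
    else
        echo "CONFLICT"         # both changed, differently (Scenario A)
    fi
}

merge3 1 2 3   # Scenario A
merge3 1 1 3   # Scenario B
merge3 1 2 1   # Scenario C
merge3 1 2 2   # Scenario D
```

Note that the ancestor is what makes the decision possible: without it, Scenarios B and C would look exactly like Scenario A.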

Exercise 2: Draw the Merge Base Calculation

Given this commit graph, identify the merge base:

      A---B---C---D (feature)
     /
E---F---G---H---I (main)

Question: What is the merge base when merging feature into main?

Answer: Commit F is the merge base - it’s the most recent common ancestor.

Now consider a more complex scenario:

        A---B---C (feature)
       /         \
  E---F---G---H---M (main, merged feature once)
       \         /
        J---K---L (hotfix, also merged)

Question: If you merge feature into main again after M, what’s the merge base?

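Answer: Assuming feature has gained new commits since M, the merge base is now C (not F), because C is already reachable from main through merge commit M. There is still a single best common ancestor here, so no virtual merge base is needed. Git only builds one when a criss-cross history leaves several equally good candidates.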
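
Rather than trusting the drawing, you can ask Git. The sketch below rebuilds the first graph with empty commits and then repeats the question after one merge. The node helper and the extra commit X are assumptions added for the demo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

# Helper (hypothetical): create an empty commit whose subject is the node label.
node() { git commit -q --allow-empty -m "$1"; }

node E; node F
git checkout -q -b feature
node A; node B; node C; node D
git checkout -q main
node G; node H; node I

first_base=$(git log -1 --format=%s "$(git merge-base main feature)")
echo "merge base before any merge: $first_base"

# Merge feature once, then let feature grow by one more commit.
git merge -q --no-edit feature
git checkout -q feature
node X
second_base=$(git log -1 --format=%s "$(git merge-base main feature)")
echo "merge base after merging once: $second_base"
```

The base moves from the original fork point to the tip that feature had when it was merged, which is why a second merge only has to consider the new work.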

Exercise 3: Predict the Merge Strategy

For each scenario, predict which merge strategy Git will use by default:

Merging two branches with one common ancestor
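- Strategy: ort (the default since Git 2.34; older versions use recursive), performing a 3-way merge
Merging three feature branches at once (git merge feat1 feat2 feat3)
- Strategy: octopus (the default when more than one branch is named; it refuses merges that need manual conflict resolution)
Branches with multiple common ancestors (criss-cross merge history)
- Strategy: ort/recursive, which first merges the common ancestors into a virtual merge base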
Rebasing instead of merging
- Strategy: Not a merge strategy - rebase replays commits one by one
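
The octopus case is the easiest one to check, because the result has more than two parents. This is a sketch assuming three non-conflicting branches; all names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Three branches that each add their own file, so nothing conflicts.
for b in feat1 feat2 feat3; do
    git checkout -q -b "$b" main
    echo "$b" > "$b.txt"
    git add "$b.txt"
    git commit -q -m "add $b"
done

# Let main move on too, so its tip is a real parent of the merge.
git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main moves on"

git merge --no-edit feat1 feat2 feat3 >/dev/null 2>&1
parent_count=$(git rev-list --parents -n 1 HEAD | awk '{ print NF - 1 }')
echo "octopus merge commit has $parent_count parents"
```

The merge commit records main plus the three feature tips. If any pair of branches had conflicted, the octopus strategy would have given up and asked you to merge them one at a time.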

The Interview Questions They’ll Ask

“What is the difference between git merge and git rebase?”

Expected Answer:
- Merge combines two branches by creating a new “merge commit” with two parents. History shows the divergence.
- Rebase replays your commits on top of another branch, rewriting history to appear linear.
- Use merge for shared/public branches (preserves history). Use rebase for local cleanup before pushing.
“What is a fast-forward merge?”

Expected Answer: A fast-forward merge happens when there’s no divergence - the target branch is a direct ancestor of the source. Git simply moves the pointer forward without creating a merge commit.
```
# To prevent fast-forward and always create a merge commit:
$ git merge --no-ff feature-branch
```

“How do you abort a merge that went wrong?”

Expected Answer:

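$ git merge --abort                 # During a merge that stopped on conflicts
$ git reset --hard ORIG_HEAD        # Right after a completed merge, to undo it (discards uncommitted changes)
$ git revert -m 1 <merge-commit>    # If the merge has already been pushed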

“Explain the 3-way merge algorithm.”

Expected Answer: Git compares three versions: the common ancestor (merge base), your version (ours), and their version (theirs). If only one side changed a region, Git takes that change. If both sides made the identical change, Git takes it once. If both sides changed the same region differently, Git cannot decide and marks a conflict.
“When would you use merge vs rebase in a team workflow?”

Expected Answer:
- Merge: For integrating feature branches into main/shared branches. Creates an explicit merge point. Safe for collaboration.
- Rebase: For updating your local feature branch with latest main before pushing. Creates clean linear history. Never rebase commits that have been pushed to a shared repository.
“What’s a merge commit vs a fast-forward?”

Expected Answer:
- Merge commit: A commit with two parents that represents the joining of two branches. Created when branches have diverged.
- Fast-forward: No new commit created. The branch pointer simply moves forward because there’s no divergence.
“How do you handle repeated merge conflicts (same conflict over and over)?”

Expected Answer: Use git rerere (Reuse Recorded Resolution):
```
$ git config --global rerere.enabled true
```
Git will remember how you resolved conflicts and automatically apply the same resolution next time.
“What is rerere and when would you use it?”

Expected Answer: rerere = “REuse REcorded REsolution”. When enabled, Git records your conflict resolutions. If the same conflict appears again (common in long-running feature branches with frequent rebases), Git automatically applies your previous resolution. Essential for teams doing frequent rebases or merge-heavy workflows.

“How do you resolve a conflict where one person deleted a file and another modified it?”

Expected Answer: This is a modify/delete conflict. You must decide:

# To keep the file (accept the modification):
$ git add path/to/file.txt

# To delete the file (accept the deletion):
$ git rm path/to/file.txt

# Then continue
$ git commit  # or git rebase --continue
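
Here is the whole modify/delete cycle in one throwaway repository, as a sketch. The branch name user-b and the file x.txt are assumptions that mirror the "Conflict 2" setup; this version accepts the deletion.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "shared" > x.txt
git add x.txt
git commit -q -m "base"

git checkout -q -b user-b
echo "modified by B" >> x.txt
git commit -q -am "B modifies x.txt"

git checkout -q main
git rm -q x.txt
git commit -q -m "A deletes x.txt"

# The merge stops with a modify/delete conflict.
git merge user-b >/dev/null 2>&1 || true
status=$(git status --porcelain -- x.txt)    # DU = deleted by us, unmerged
echo "$status"

# Accept the deletion, then conclude the merge.
git rm -q x.txt >/dev/null 2>&1
git commit -q --no-edit
parent_count=$(git rev-list --parents -n 1 HEAD | awk '{ print NF - 1 }')
```

To keep B's modification instead, replace the git rm line with git add x.txt.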

“What happens to merge conflicts during a rebase vs a merge?”

Expected Answer:
- Merge: You resolve all conflicts once, in a single merge commit.
- Rebase: You may have to resolve the same logical conflict multiple times (once per replayed commit). This is why rerere is valuable during rebases.

“How do you see what changes are coming from each side of a conflict?”

Expected Answer:

$ git diff --ours    # Working tree (with conflict markers) vs. our version (stage 2)
$ git diff --theirs  # Working tree vs. their version (stage 3)
$ git diff --base    # Working tree vs. the common ancestor (stage 1)
$ git log --merge    # Commits involved in the conflict
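
During a conflict the index holds all three versions of the file as numbered stages, and you can read each one directly. This is a minimal sketch; the file name and contents are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "result = 1" > calculator.py
git add calculator.py
git commit -q -m "base"
git checkout -q -b feature
echo "result = 3" > calculator.py
git commit -q -am "theirs"
git checkout -q main
echo "result = 2" > calculator.py
git commit -q -am "ours"

git merge feature >/dev/null 2>&1 || true    # stops on the conflict

base=$(git show :1:calculator.py)            # stage 1: common ancestor
ours=$(git show :2:calculator.py)            # stage 2: our side (HEAD)
theirs=$(git show :3:calculator.py)          # stage 3: their side (MERGE_HEAD)
echo "base:   $base"
echo "ours:   $ours"
echo "theirs: $theirs"

unmerged=$(git ls-files -u | wc -l)          # one index entry per stage
git merge --abort
```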

Hints in Layers

Hint 1: Local Remotes You don’t need GitHub. git clone ./my-repo ./my-repo-clone works perfectly on your local disk.

# Create a bare repo (like GitHub would have)
$ git init --bare ~/repos/project.git

# Clone it twice to simulate two developers
$ git clone ~/repos/project.git ~/dev/alice
$ git clone ~/repos/project.git ~/dev/bob

Hint 2: Viewing Conflicts git status tells you exactly which files are unmerged.

$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
        both modified:   calculator.py

Hint 3: Using git mergetool Configure a visual merge tool to make conflict resolution easier:

# Set up a merge tool (pick one; a later merge.tool setting overrides an earlier one)
$ git config --global merge.tool vimdiff
# OR
$ git config --global merge.tool vscode
$ git config --global mergetool.vscode.cmd 'code --wait "$MERGED"'

# During a conflict, launch the tool:
$ git mergetool

# The tool shows a 3-pane or 4-pane view:
# LOCAL (yours) | BASE (ancestor) | REMOTE (theirs)
#                     MERGED (result)

Hint 4: The rerere Feature (REuse REcorded REsolution) Enable rerere to remember how you resolve conflicts:

# Enable globally
$ git config --global rerere.enabled true

# How it works:
# 1. First time you hit a conflict, you resolve it manually
# 2. Git records the resolution in .git/rr-cache/
# 3. Next time the SAME conflict appears, Git auto-applies your resolution

# View recorded resolutions
$ ls .git/rr-cache/

# Forget a recorded resolution
$ git rerere forget path/to/file.txt
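
To see rerere replay a resolution, resolve a conflict once, throw the merge away, and merge again. This is a sketch under assumptions: it enables rerere for the scratch repository only (no --global), and the resolved value "A = 5" is arbitrary.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
git config rerere.enabled true               # repo-local, just for this demo

echo "A = 1" > config.txt
git add config.txt
git commit -q -m "base"
git checkout -q -b feature
echo "A = 3" > config.txt
git commit -q -am "theirs"
git checkout -q main
echo "A = 2" > config.txt
git commit -q -am "ours"

# First merge: conflict, resolved by hand. The commit records the resolution.
git merge feature >/dev/null 2>&1 || true
echo "A = 5" > config.txt
git add config.txt
git commit -q --no-edit

# Throw the merge away and repeat it.
git reset -q --hard HEAD~1
git merge feature >/dev/null 2>&1 || true
replayed=$(cat config.txt)                   # rerere rewrote the file for us
echo "working tree after second merge: $replayed"
```

The second merge still stops and reports a conflict, but the file already contains your earlier resolution. You only need to review it, git add it, and commit.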

Hint 5: Using --ours and --theirs for Quick Resolution When you know which side should “win” entirely:

# During a merge conflict, accept OUR version entirely:
$ git checkout --ours path/to/file.txt
$ git add path/to/file.txt

# Or accept THEIR version entirely:
$ git checkout --theirs path/to/file.txt
$ git add path/to/file.txt

# WARNING: During REBASE, ours/theirs are SWAPPED!
# In rebase: --ours = the branch you're rebasing onto
#            --theirs = your commits being replayed
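
The swap is easier to believe once you have seen it. The sketch below rebases a conflicting feature branch onto main and reads both sides from the index; the branch and file names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "A = 1" > config.txt
git add config.txt
git commit -q -m "base"
git checkout -q -b feature
echo "A = 3" > config.txt
git commit -q -am "feature change"
git checkout -q main
echo "A = 2" > config.txt
git commit -q -am "main change"

# Rebase feature onto main; it stops on the conflict.
git checkout -q feature
git rebase main >/dev/null 2>&1 || true

ours_stage=$(git show :2:config.txt)         # "ours" is now main, the branch rebased onto
theirs_stage=$(git show :3:config.txt)       # "theirs" is the feature commit being replayed
echo "ours:   $ours_stage"
echo "theirs: $theirs_stage"

# Keep the feature commit's version and finish the rebase.
git checkout -q --theirs config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
final=$(cat config.txt)
```

So during a rebase, git checkout --theirs keeps your own work, which is the opposite of what the name suggests during a merge.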

Hint 6: Handling Binary File Conflicts Binary files (images, PDFs, compiled files) cannot be merged line-by-line:

$ git merge feature-branch
warning: Cannot merge binary files: logo.png (HEAD vs. feature-branch)
CONFLICT (content): Merge conflict in logo.png
Automatic merge failed; fix conflicts and then commit the result.

# You must choose one version or the other:
$ git checkout --ours logo.png    # Keep our version
# OR
$ git checkout --theirs logo.png  # Keep their version

$ git add logo.png
$ git commit

For images, you might use a visual diff tool:

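# Configure an image diff tool (requires ImageMagick; git difftool sets $LOCAL and $REMOTE)
$ git config difftool.image.cmd 'compare "$LOCAL" "$REMOTE" png:- | montage -label "Local" "$LOCAL" -label "Remote" "$REMOTE" -label "Diff" png:- -tile 3x1 -geometry 300x300+4+4 - | display'
# Compare our version with theirs during the conflict
$ git difftool --tool=image HEAD MERGE_HEAD -- logo.png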

Hint 7: Rebase Flow vs Merge Flow When pulling with conflicts, git pull --rebase is often cleaner than git pull (which makes a merge commit).

# Merge flow (creates merge commits)
$ git pull origin master
# Results in diamond-shaped history

# Rebase flow (linear history)
$ git pull --rebase origin master
# Results in straight-line history

# Make rebase the default for pulls
$ git config --global pull.rebase true

Hint 8: Previewing What Will Be Merged Before merging, see what changes are coming:

# See which commits will be merged
$ git log HEAD..origin/master --oneline

# See the actual diff
$ git diff HEAD...origin/master

# Dry-run merge to see if there will be conflicts
$ git merge --no-commit --no-ff origin/master
$ git diff --cached  # See what would be merged
$ git merge --abort  # Cancel and go back

Hint 9: The Nuclear Option - Starting Over If everything goes wrong:

# During a merge
$ git merge --abort

# During a rebase
$ git rebase --abort

# After a merge or rebase you regret (ORIG_HEAD is set by merge, rebase, and reset, not by a plain commit)
$ git reset --hard ORIG_HEAD

# Complete reset to remote state (DESTRUCTIVE)
$ git fetch origin
$ git reset --hard origin/master

Books That Will Help

Topic	Book	Chapter
Basic Branching & Merging	“Pro Git” by Scott Chacon	Ch. 3.2 - “Basic Branching and Merging”
Remote Branches	“Pro Git” by Scott Chacon	Ch. 3.5 - “Remote Branches”
Rebasing	“Pro Git” by Scott Chacon	Ch. 3.6 - “Rebasing”
Distributed Workflows	“Pro Git” by Scott Chacon	Ch. 5.1 - “Distributed Workflows”
Contributing to a Project	“Pro Git” by Scott Chacon	Ch. 5.2 - “Contributing to a Project”
Maintaining a Project	“Pro Git” by Scott Chacon	Ch. 5.3 - “Maintaining a Project”
Advanced Merging	“Pro Git” by Scott Chacon	Ch. 7.8 - “Advanced Merging”
Rerere	“Pro Git” by Scott Chacon	Ch. 7.9 - “Rerere”

Additional Resources:

Resource	Description	URL
Git Documentation	Official merge documentation	`git help merge`
Atlassian Git Tutorials	Visual merge conflict guide	atlassian.com/git/tutorials
Git Internals PDF	Deep dive into merge algorithms	Available on git-scm.com

Pro Git Chapter 5 Deep Dive (Distributed Workflows):

This chapter is essential for understanding how teams collaborate. Key sections:

5.1 Distributed Workflows
- Centralized Workflow (single shared repo)
- Integration-Manager Workflow (fork + pull request model)
- Dictator and Lieutenants Workflow (Linux kernel model)
5.2 Contributing to a Project
- Commit guidelines
- Private small team workflows
- Private managed team workflows
- Forked public project workflows
5.3 Maintaining a Project
- Working in topic branches
- Applying patches from email
- Determining what is introduced
- Integrating contributed work
- Tagging your releases
- Generating a build number

Pro Git Chapter 7.8 Deep Dive (Advanced Merging):

This chapter covers the technical details of merge strategies:

Merge Conflicts - Understanding the conflict markers
Undoing Merges - git reset vs git revert -m 1
Merge Strategies - recursive, octopus, ours, subtree
Merge Options - --ignore-space-change, --ignore-all-space, -Xours, -Xtheirs

Summary

This learning path covers Git Mastery through 5 hands-on projects.

#	Project Name	Main Tool	Difficulty	Time Estimate
1	The Precision Surgeon	Git CLI	Beginner	Weekend
2	The Time Traveler	Git CLI	Intermediate	Weekend
3	The History Surgeon	Git CLI	Advanced	1 week
4	The Detective	Git CLI	Intermediate	Weekend
5	The Merge Conflict Dojo	Git CLI	Advanced	1 week

Recommended Learning Path

For beginners: Start with Project 1 to stop using git add . blindly.
For intermediate: Jump to Project 3 to master rewriting history.
For advanced: Focus on Project 5 to simulate complex team conflicts.

Expected Outcomes

After completing these projects, you will:

Never fear a detached HEAD again.
Know how to find lost code using reflog.
Be able to clean up messy history before pushing.
Debug regressions automatically with bisect.
Resolve merge conflicts without losing data.

You’ll have built a mental model of the Git Graph that allows you to predict exactly what every command will do.
