Basics like initializing (a repository), staging and committing files aren’t explained here; they simply make sense; no ‘Aha!’s there. Moving references, branching and merging (coupled with Git’s arcane command names) are the confusing parts.

Basics

Git is a distributed VCS; each repo can act as both server and client.

Honestly, Git (sub)commands are just graph-manipulating commands.

Every codebase is a graph; each commit is a node with edges to its parent(s). Git diagrams often have arrows pointing backwards (←) for this reason.

Git stores snapshots, not differences, i.e. entire file contents, each as a blob. Every commit is a complete snapshot of the tracked repo contents plus (0 or more) parent ID(s), identified by a SHA-1 hash written as 40 hexadecimal characters. This way, the exact state of your project can be referred to, copied, or restored at any time.
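To see this for yourself, here is a minimal sketch that makes two commits in a throwaway repository and prints the raw commit object. The file name, contents and identity settings are made up for illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo "second version" > file.txt
git commit -q -am "second"

# A commit object names a tree (the snapshot) plus its parent ID(s).
commit_object=$(git cat-file -p HEAD)
echo "$commit_object"

# The blob behind file.txt holds the full contents, not a diff.
blob=$(git cat-file -p HEAD:file.txt)
echo "$blob"
```

The `tree` line points at the snapshot and the `parent` line at the previous commit; the root commit has no `parent` line at all.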



“finally figuring out that git commands are strangely named graph manipulation commands – creating/deleting nodes, moving around pointers” – Kent Beck

Nodes of the graph are created by your commits

Nodes are never really deleted in the traditional sense; they’re made unreachable (see below). These unreachable nodes eventually get garbage collected by Git.



Reachability

      A---B---C
     /
D---E---F---G
     \
      H---I

An important (linked-list) concept that applies to Git (too)

If the first node is lost, the list, too, is lost.

Since a commit also has parent commit(s) (except root), following the chain of parents will eventually take you back to the beginning of the project

In a well-branched graph, depending on the leaf node you start from, different parts of the graph will be reachable

Commit X is “reachable” from commit Y if commit X is an ancestor of commit Y. In the above example, A, B and C are unreachable from G; so are F and G when starting from C, B or A.
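The example graph can be rebuilt in a scratch repository to check these claims. This is only a sketch: the `node` helper and the branch names `top`, `middle` and `bottom` are invented here, and each commit is tagged with its letter so it can be named directly.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

# Hypothetical helper: make an empty commit and tag it with its letter.
node() { git commit -q --allow-empty -m "$1" && git tag "$1"; }

node D; node E
git checkout -q -b top E;    node A; node B; node C
git checkout -q -b middle E; node F; node G
git checkout -q -b bottom E; node H; node I

# Walk the parent chain from a leaf: only its ancestors show up.
from_G=$(git log --format=%s G -- | xargs)
from_C=$(git log --format=%s C -- | xargs)
echo "from G: $from_G"
echo "from C: $from_C"

# Ask the ancestry question directly.
if git merge-base --is-ancestor E C; then e_in_c=yes; else e_in_c=no; fi
if git merge-base --is-ancestor A G; then a_in_g=yes; else a_in_g=no; fi
echo "E reachable from C: $e_in_c, A reachable from G: $a_in_g"
```

D and E appear in both walks because they are common ancestors; everything after the fork at E is visible from one leaf only.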

The gc subcommand walks the graph, building a list of every commit it can reach, and removes the unreachable ones. It will clear up disk space, but there’s no good reason to run it often. Some Git subcommands may run it automatically too!

References

“References make commits reachable” – Think like a Git

Plainly, references are “meaningful” names for some commits. They facilitate easy git-speak with your friends/colleagues 😜 Branches and tags are references too.

Creating a branch reference is a way to “nail down” part of the graph that you want to return to later (reachability).

References are just reference-named files containing a 40-character commit ID. They’re specific to a single repository. Remote references are local, remote-tracking references to a commit in a remote repository.
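A quick way to convince yourself is to read the files directly. The sketch below assumes a fresh repository using Git's default file-based ref storage with loose (unpacked) refs; in an older, busier repository the same refs may live in `.git/packed-refs` instead. The branch name `topic` is made up.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

git commit -q --allow-empty -m "first"
git branch topic

# Assumption: loose refs in the default "files" backend.
branch_file=$(cat .git/refs/heads/topic)
branch_id=$(git rev-parse topic)
echo "file says: $branch_file"
echo "git says:  $branch_id"

# HEAD is a file too; when attached it names a branch, not a commit.
head_file=$(cat .git/HEAD)
echo "HEAD file: $head_file"
```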

There’re many more ways of referring to commits: man gitrevisions is your friend. Collectively, they’re called commit-ish.
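As a small illustration, the sketch below names one commit in five different ways and resolves each with `git rev-parse`. The tag `v1` and the commit messages are invented for the example.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

for n in 1 2 3; do git commit -q --allow-empty -m "commit $n"; done
git tag v1 HEAD~2

# Five spellings, one commit.
by_tilde=$(git rev-parse HEAD~2)          # two generations back from HEAD
by_caret=$(git rev-parse HEAD^^)          # first parent of the first parent
by_branch=$(git rev-parse master~2)       # same walk, starting from a branch
by_tag=$(git rev-parse v1)                # a tag reference
by_message=$(git rev-parse ":/commit 1")  # youngest commit matching the text
printf '%s\n' "$by_tilde" "$by_caret" "$by_branch" "$by_tag" "$by_message"
```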

Commands Affecting Refs

These are the primary subcommands that allow you to move refs directly:

commit

merge

rebase

reset

Subcommands that affect moving remote refs:

fetch

push

Commands like pull , cherry-pick , … work atop these.

Checkout vs Reset

Before getting into the details, here’s the gist

checkout mostly operates on the working tree, while reset operates on the index.

To understand both commands, you first need to understand HEAD . Most people know about the working tree and staging area but not HEAD .

HEAD references the currently checked out commit; your working tree will mostly be from this snapshot – the commit pointed to by HEAD . Pro Git summarizes this nicely

HEAD will be the parent of the next commit that is created.

Checkout

git checkout HEAD -- file

When you checkout a file from HEAD , what you do is get a clean copy of file from the commit HEAD is pointing to; this replaces your working tree copy. HEAD is just a convenient default; you can replace it with any ref. If the ref is omitted, the copy comes from the index (the stage).
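The difference between the two sources is easiest to see with a file that exists in three different states at once. A minimal sketch, with a made-up file name and contents:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

echo "committed" > file.txt
git add file.txt
git commit -q -m "init"

echo "staged" > file.txt
git add file.txt                # the index now holds "staged"
echo "scratch" > file.txt       # the working tree holds "scratch"

git checkout -- file.txt        # no ref given: copy comes from the index
from_index=$(cat file.txt)

git checkout HEAD -- file.txt   # ref given: copy comes from that commit
from_head=$(cat file.txt)

echo "from index: $from_index, from HEAD: $from_head"
```

Note that the second form also overwrites the staged copy, so the "staged" version is gone afterwards.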

git checkout topic

When you checkout a branch (reference to a commit/node) e.g. topic , HEAD will be set to its tip commit and hence the entire working tree, not just a file, will be from the commit that branch is pointing to.

Reset

Plainly, reset moves HEAD around. It’s used to move HEAD to a given commit. There’re different flavours of doing this, depending on what happens to the index and working tree ( --hard , --soft , --mixed , …), but the crux is to move HEAD .
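Here is a rough sketch of how the three common flavours differ. It moves HEAD from a second commit back to the first one each time and looks at what is left in the index and the working tree. The tags `v1` and `v2` and the file name are invented for the example.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

echo "v1" > f.txt
git add f.txt
git commit -q -m "v1"
git tag v1
echo "version 2" > f.txt
git commit -q -am "v2"
git tag v2

# --soft: only HEAD moves; index and working tree keep the v2 content.
git reset -q --soft v1
soft_staged=$(git diff --cached --name-only)
soft_file=$(cat f.txt)
git reset -q --hard v2          # back to the starting point

# --mixed (the default): the index is reset too; the working tree is kept.
git reset -q --mixed v1
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)
git reset -q --hard v2

# --hard: index and working tree are both reset.
git reset -q --hard v1
hard_file=$(cat f.txt)

echo "soft: staged=$soft_staged file=$soft_file"
echo "mixed: staged=$mixed_staged unstaged=$mixed_unstaged"
echo "hard: file=$hard_file"
```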

But isn’t that what checkout does too? Yes, but with a difference. Quoting Pro Git, with my emphasis

reset will […] move what HEAD points to. This isn’t the same as changing HEAD itself (which is what checkout does); reset moves the branch that HEAD is pointing to.

Caveat: with reset , HEAD moves the branch reference along with it, only if it’s attached.

Detached HEAD

Whoa! Slow down there, cowboy. Before talking about detached, what’s the attached state of HEAD ? We already know that HEAD is just a reference to a commit. Say this commit also has another reference pointing to it: a branch name.

When HEAD is moved by reset , if it’s attached to a branch, that reference too will move with HEAD .

C1 <-- C2 <-- C3 <-- C4 <-- C5 <-- master
                                     ^
                                     |
                                    HEAD

git reset --hard C3

This would move both HEAD and master to C3 . HEAD would continue to be attached. Now if it weren’t attached, it’ll only move HEAD leaving master behind, hence the detached HEAD state.
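Both cases can be replayed in a scratch repository. In this sketch the commits are empty and tagged `C1` to `C5` (names chosen to match the diagram) so that they stay nameable after the branch moves away from them.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

for n in 1 2 3 4 5; do
  git commit -q --allow-empty -m "C$n"
  git tag "C$n"
done

# Attached: reset drags master along with HEAD.
git reset -q --hard C3
master_after_attached=$(git log -1 --format=%s master --)
head_ref=$(git symbolic-ref -q HEAD)      # still refs/heads/master

# Detached: reset moves HEAD only and leaves master where it was.
git checkout -q C5
git reset -q --hard C4
head_after_detached=$(git log -1 --format=%s HEAD --)
master_after_detached=$(git log -1 --format=%s master --)

echo "attached reset: master=$master_after_attached, HEAD -> $head_ref"
echo "detached reset: HEAD=$head_after_detached, master=$master_after_detached"
```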

In its detached state, HEAD refers to a specific commit as opposed to referring to a named branch. Like Git’s diagnostic message says, it’s useful to poke around and inspect the code base at a particular commit. Making a new commit now would mean a commit only pointed to by HEAD .

There’re a couple of ways to identify if HEAD is detached. The very first line of git status will tell you:

> git status
On branch master
…
> git status
# HEAD detached at 847fe59

Another way is to use git log ; I learnt from this actually.

> git log --oneline -5
847fe59 (HEAD -> master) Initial commit
…
> git log --oneline -5
847fe59 (HEAD, master) Initial commit

Notice that when HEAD is attached, you see an arrow (→) pointing to the branch it’s attached to. However, in the detached state they’re listed as independent items.

Attach/Detaching HEAD

How do we attach or detach HEAD to a reference? Both are done with checkout , but with a subtle difference. To attach HEAD , you’d checkout a branch by its name:

> git checkout master

When you checkout a commit using anything other than a branch name, you detach HEAD : e.g. a commit ID, HEAD~1, branch~3, HEAD@{5}, HEAD^^ , etc. Since it wouldn’t know which branch to associate HEAD with, Git detaches it. This is what you normally do when you want to inspect the code base at a particular commit that has no name other than its commit ID.

> git checkout lk3nw7ef

Here, it doesn’t matter if this commit has other branch references to it. Since you referred to it using the raw commit ID, Git takes it as a cue to detach HEAD .
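The sketch below checks this in a one-commit repository where master points at the very commit being checked out by ID. It uses `git rev-parse --abbrev-ref HEAD`, which prints the branch name when HEAD is attached and the literal word `HEAD` when it is detached.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

git commit -q --allow-empty -m "Initial commit"
id=$(git rev-parse HEAD)

git checkout -q master                    # by branch name: attached
attached=$(git rev-parse --abbrev-ref HEAD)

git checkout -q "$id"                     # by raw commit ID: detached
detached=$(git rev-parse --abbrev-ref HEAD)

# master still points at the same commit; only HEAD's attachment changed.
master_id=$(git rev-parse master)

git checkout -q master                    # re-attach
reattached=$(git rev-parse --abbrev-ref HEAD)
echo "$attached / $detached / $reattached"
```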

Practise

I highly recommend playing around in Visualizing Git with checkout , reset ; also get your hands dirty with the whole attach/detach business. Here’s a small snippet to get you started; see what happens as each command gets executed:

git commit
git commit
git commit
git commit
git commit
# create topic branch and checkout; HEAD now attached to topic
git checkout -b topic
# move HEAD one commit behind topic; this will also move topic with HEAD
git reset topic~1
# detach HEAD!
git checkout HEAD~2
# attach to master
git checkout master
# move back master by 3
git reset master~3
# move master forward/backward with commit ID
git reset f08ad6

Rebase

rebase seems to have a scary reputation on the web, with good reason of course. It’s infamous for rewriting history; something your teammates mightn’t take kindly to. However, when you’re doing this only locally, within your repo, before pushing, it’s a great tool.

The crux of a rebase : given a subgraph’s root node, rebase changes its parent pointer from one node to another; thereby rebasing the entire subgraph to a new parent.

Take note, a commit is not just its contents but also includes its parent(s). So any kind of rebase entails — since the parent/lineage is changed — a change of commit ID for the same commit contents.
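To watch the commit ID change, the following sketch rebases a one-commit `topic` branch onto a master that has moved on. Branch, file and message names are illustrative only.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name used below
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature"
before=$(git rev-parse topic)

git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "main"

git checkout -q topic
git rebase -q master
after=$(git rev-parse topic)

new_parent=$(git rev-parse topic^)
master_id=$(git rev-parse master)
feature=$(cat feature.txt)
echo "before: $before"
echo "after:  $after"
```

The change introduced by the commit (feature.txt) is the same, but because its parent is now the tip of master, Git has to give it a new ID.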

Interactive rebase ( rebase -i ) is quite useful. I frequently use it to amend (not just the recent commit), fix, reword, edit, drop or squash commits. During an interactive rebase, one can even create multiple commits as usual and continue with the rebase; things will be taken care of! This is normal when dividing a commit into smaller parts.

pull = fetch + merge or rebase? 🤔

When pulling from a remote branch, you might know that your changes are unrelated to the ones coming down. In this case, to avoid a merge commit and have a linear commit history, you’d pass --rebase to override the default way pull integrates changes: merge.

git pull --rebase origin master

git pull is just git fetch followed by git merge , which creates a new merge commit whenever your branch and the remote have diverged. git pull --rebase , however, is git fetch followed by git rebase ; it fetches the remote’s commits and then replays your local commits atop the fetched branch’s tip. This works if there’re no conflicts; otherwise you’ve to resolve them as you normally would. The resolution (changes) becomes a part of the commit where rebase halted; you’d end up rewriting your own commit. However, you don’t have to force push your changes to the remote, since the resolution only touched your local, unpushed commits. Rewriting (commit) history, as long as it is not public, is OK 😉
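The whole round trip can be simulated locally with a bare repository standing in for the remote. This is only a sketch; the directory names `remote.git`, `alice` and `bob` and the commit messages are made up.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

# A bare repository plays the role of the remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

git clone -q remote.git alice             # warns that the repo is empty
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"; git config user.email "alice@example.com"
echo "a" > a.txt; git add a.txt; git commit -q -m "base"
git push -q origin master
cd ..

git clone -q remote.git bob
cd bob
git config user.name "Bob"; git config user.email "bob@example.com"
echo "b" > b.txt; git add b.txt; git commit -q -m "local-change"
cd ../alice

# Meanwhile the remote moves on.
echo "a2" > a2.txt; git add a2.txt; git commit -q -m "remote-change"
git push -q origin master
remote_tip=$(git rev-parse HEAD)
cd ../bob

# Fetch, then replay Bob's commit on top of the fetched tip.
git pull -q --rebase origin master
history=$(git log --format=%s | xargs)
merge_commits=$(git rev-list --count --merges HEAD)
parent=$(git rev-parse HEAD^)
echo "history: $history (merge commits: $merge_commits)"
```

Bob's commit ends up as a direct child of the remote's tip, so the history stays linear and no merge commit is recorded.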

A counter point to pull-with-rebase: if you want logical separation of a set of commits, say for a completely new feature, then rebase — which makes them inline, muddled with unrelated history — isn’t the right tool; use merge instead.

Use git pull --rebase when your changes do not deserve a separate branch.

This seems to be the appropriate answer to when you should git pull --rebase .

See Also

I get surprised by Git commands every now and then; I document the obscure but useful ones!

Learn by Doing

try.github.io for good DIY resources.

Visualizing Git – lets you visualize your git commands

Visualizing Git Concepts with D3 – explains commands with interactive images

A Visual Git Reference – explains commands with images

References