Our git log will output the following:

* 720ae00 (HEAD -> master) Second commit

* 347c154 First commit

As expected, we have two commits, each with a hash ID, and as you may notice the second commit is labelled (HEAD -> master).

master, as you may know, is the branch you are currently on.

HEAD is just a pointer to whatever is currently checked out. Right now that is the master branch, which is what the -> after HEAD signals. Let’s check out the first commit:

> git checkout 347c154

> git lg --all

* 720ae00 (master) Second commit

* 347c154 (HEAD) First commit

Performing a git checkout moves the HEAD pointer to another commit, and your project folder is updated to reflect the state of that commit.

As you can see, our HEAD is now pointing to the first commit, in a detached state. Being in a detached state only means that HEAD points directly to a commit instead of to a branch.
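If you want to see this for yourself, here is a small sketch you can run in a throwaway folder. The temporary directory, the Gopher identity and the file contents are my own assumptions, not the repository used in this article, so your hashes will differ.

```shell
# Throwaway repository (assumed setup; names and contents are illustrative).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && git add A && git commit -q -m "First commit"
echo "Another file" > B && git add B && git commit -q -m "Second commit"

# While on a branch, HEAD is a symbolic reference to that branch.
attached=$(git symbolic-ref HEAD)
echo "$attached"                      # refs/heads/master

# Check out the first commit directly: HEAD now holds a commit, not a branch.
first=$(git rev-parse HEAD~1)
git checkout -q "$first"
detached=$(git rev-parse HEAD)
git symbolic-ref -q HEAD || echo "HEAD is detached at $detached"
```

git symbolic-ref fails on a detached HEAD, which is exactly the point: there is no branch name to report, only a commit.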

But what is a branch?

Well, conceptually a branch is any line of commits that diverges from another in the git database. As I said previously, git is just a database of commits built on top of each other, and together they form a directed acyclic graph. If you want to know more about graphs, check this amazing article by Vaidehi Joshi:

scenario where we have 3 branches

In our previous example we only had one branch, the master branch. In the figure above we see three branches, because the history diverges at two points, commit 2 and commit 3.

How does git store all this? The shocking truth is that a branch is just a reference (a pointer) to a commit. If you think about it, since we are dealing with a graph, knowing the last commit is enough to reach all its ancestors. But this is only the start.
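Here is a minimal sketch of the "a branch is just a pointer" idea. The repository, the identity and the branch name branch-2 are assumptions made for illustration.

```shell
# Throwaway repository with two commits (assumed setup).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && git add A && git commit -q -m "First commit"
echo "Another file" > B && git add B && git commit -q -m "Second commit"

# A branch name resolves to exactly one commit hash.
tip=$(git rev-parse master)
git show-ref --heads                  # <hash> refs/heads/master

# From the tip alone git can walk back to every ancestor.
count=$(git rev-list --count master)
echo "$count commits reachable from master"

# Creating a branch only adds another pointer to the same commit.
git branch branch-2
other=$(git rev-parse branch-2)
[ "$tip" = "$other" ] && echo "branch-2 and master point to the same commit"
```

Creating branch-2 copied no files and no commits. It only recorded one more name for an existing hash.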

how git represents a commit

Besides being a reference, a branch is also a scope: whenever HEAD points to a branch and you make a commit, the branch, and HEAD along with it, automatically moves to the new commit. git reset, on the other hand, also moves the current branch, but to whichever commit you name, which usually means backwards.

So:

> git checkout branch-2 # our HEAD is pointing to branch-2

> git reset --hard <hash-commit-7> # move HEAD -> branch-2 to commit 7

So we end up with this graph:

git graph after a branch-2 reset to commit 7

Do you see how branch-2 is now pointing to commit 7? But wait, commit 8 is still there. No, I didn’t forget commit 8 when I made this figure. Git doesn’t erase commits on a reset, so you don’t lose any version when you do this: the commit just becomes unreachable from the branch, and it stays in the database until git eventually garbage-collects it.

In the next topic we are going to see something that you don’t see every day. Let’s use some less common commands to understand what is happening under the hood.
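The same behaviour can be reproduced on a smaller scale. This sketch uses master with two commits instead of branch-2 with commits 7 and 8; the repository and identity are assumptions.

```shell
# Throwaway repository (assumed setup).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && git add A && git commit -q -m "First commit"
first=$(git rev-parse master)

# Committing while HEAD points to master moves master forward.
echo "Another file" > B && git add B && git commit -q -m "Second commit"
second=$(git rev-parse master)

# Resetting moves master back to the first commit...
git reset -q --hard "$first"
now=$(git rev-parse master)

# ...but the second commit is still in the object database.
kind=$(git cat-file -t "$second")
echo "master is at $now, and $second is still a $kind"

# The reflog remembers where HEAD has been, so the commit can be recovered.
git reflog -n 3
git reset -q --hard "$second"
```

The last line moves master forward again to the commit that looked lost, which works only because the object was never deleted.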

Going behind the git atomic point of interaction — the commit

So we already saw that git is nothing more than a database of several commits, but the question is:

How the hell can a commit be materialised into a version of your project?

It’s not something you have to worry about, but knowing how it works gives you one more reason to trust git.

Let’s go back to our scenario where we had two commits:

> git log

* 720ae00 (HEAD -> master) Second commit

* 347c154 First commit

Now I am going to use a command that may be new to you, git cat-file. Its main options are:

-p # pretty-print the object’s content

-t # show the object’s type

-s # show the object’s size

Let’s see the type of our first commit:

> git cat-file -t 347c154

commit

As expected it’s a commit.

I hope you are excited, because right now we are going to see the content of a commit, an aha moment:

> git cat-file -p 347c154

tree 99d89c723829ec2352809c52e507b2a46119a948

author **** <****@gmail.com> 1528907497 +0100

committer **** <****@gmail.com> 1528907497 +0100

First commit

Hmm, very curious: we have a tree, an author, a committer and the commit message. I’m intrigued by the tree, so let’s confirm that it really is a tree:

> git cat-file -t 99d89c723829ec2352809c52e507b2a46119a948

tree

Now let’s see the content:

> git cat-file -p 99d89c723829ec2352809c52e507b2a46119a948

100644 blob d34fd419c2aed7ceee1247760bb2b951df961959 A

100644 blob b0b9fc8f6cc2f8f110306ed7f6d1ce079541b41f B

Cool, we see that the tree is actually the listing of our files A and B. Each entry starts with the file mode (100644 is a regular, non-executable file), followed by the object type, the hash of the blob and the name of the file.
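To see other kinds of entries, here is a sketch with an executable file and a subdirectory. The file names run.sh and docs/notes.txt are hypothetical, and it assumes a filesystem that tracks the executable bit (otherwise run.sh is recorded as 100644 as well).

```shell
# Throwaway repository (assumed setup; file names are illustrative).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A
printf '#!/bin/sh\necho hi\n' > run.sh && chmod +x run.sh
mkdir docs && echo "notes" > docs/notes.txt
git add . && git commit -q -m "First commit"

# Each entry of the root tree: mode, object type, object hash, name.
git ls-tree HEAD

# Keep only the modes, in the order git lists the entries (A, docs, run.sh).
modes=$(git ls-tree HEAD | cut -d' ' -f1 | tr '\n' ' ')
echo "$modes"
```

Under those assumptions the regular file shows up as 100644, the executable as 100755, and the docs folder as a 040000 entry of type tree, that is, a tree pointing to another tree.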

Let’s see the content of the blob associated with the name A:

> git cat-file -p d34fd419c2aed7ceee1247760bb2b951df961959

A simple file

A blob is a snapshot of a file’s content.

Congratulations, we traversed the git commit all the way down to the blob, and now you know how git stores its objects. By the way, all this magic is stored in the .git folder of your project.

But wait, how does a commit know where it comes from? We have only seen the first commit, the starting point of our project. Let’s see the contents of the second commit:

> git cat-file -p 720ae00

tree e30ead0c3157e6e2a355e49cdb76ec67f6542d21

parent 347c154abd66d914ea067f749aabb23cd1b07c77

author *** <***@gmail.com> 1528907929 +0100

committer *** <***@gmail.com> 1528907929 +0100

Second commit

So in the second commit we have something we didn’t have in the first: a parent reference. Every commit except the very first one stores a reference to its parent, and there are some special commits, ‘merge commits’, that store two or more parents.
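Counting parents is easy to try. The sketch below builds a tiny history with one merge; the branch name branch-2, the file names and the parents helper are all hypothetical.

```shell
# Throwaway repository (assumed setup).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && git add A && git commit -q -m "First commit"
root=$(git rev-parse HEAD)

# Diverge: one commit on a new branch, one on master, then merge them.
git checkout -q -b branch-2
echo "B" > B && git add B && git commit -q -m "Add B on branch-2"
git checkout -q master
echo "C" > C && git add C && git commit -q -m "Add C on master"
git merge -q --no-edit branch-2

# Count "parent" lines in the commit header (everything before the first blank line).
parents() { git cat-file -p "$1" | sed '/^$/q' | grep -c '^parent '; }
root_parents=$(parents "$root")
normal_parents=$(parents branch-2)
merge_parents=$(parents HEAD)
echo "root=$root_parents normal=$normal_parents merge=$merge_parents"   # root=0 normal=1 merge=2
```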

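We can also see that, although we didn’t change the structure of our project tree, the tree ID changed. This is because every object in git is immutable and is identified by a hash of its content: changing a file produces a new blob with a new ID, and since the tree records the IDs of its blobs, the tree’s content, and therefore its ID, changes too.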
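This chain reaction can be checked with git rev-parse. In this sketch only B is modified between two commits, and a third, empty commit is added for comparison; the file contents are assumptions.

```shell
# Throwaway repository (assumed setup).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && echo "B, first version" > B
git add A B && git commit -q -m "First commit"
echo "B, second version, a bit longer" > B && git commit -q -am "Second commit"

# B changed, so its blob ID changed; A is untouched and keeps its blob.
a1=$(git rev-parse HEAD~1:A); a2=$(git rev-parse HEAD:A)
b1=$(git rev-parse HEAD~1:B); b2=$(git rev-parse HEAD:B)

# The tree lists those blob IDs, so a new blob ID means a new tree ID.
tree1=$(git rev-parse 'HEAD~1^{tree}')
tree2=$(git rev-parse 'HEAD^{tree}')

# A commit that changes nothing simply points to the existing tree again.
git commit -q --allow-empty -m "No changes"
tree3=$(git rev-parse 'HEAD^{tree}')
echo "trees: $tree1 -> $tree2 -> $tree3"
```

The first two tree IDs differ, while the second and third are identical: a new commit object was created, but it reuses the tree that already existed.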

Let me leave you with a schema representing what we’ve seen in this section:

git storing file versions

As a bonus, I am going to present a very curious case that shows how well designed git is. Let me copy file A into a new file called C. Then I will commit that file, and we’ll look at the hashes of both blobs.

> cp A C

> git add C

> git commit -m "Third commit"

> git cat-file -p HEAD

tree 948e4085e60fb71230e7e4fae2a9edbf6c7af780

parent 720ae00f0a865e56bf3ca39062cc7d3c192394bb

author *** <***@gmail.com> 1528924849 +0100

committer *** <***@gmail.com> 1528924849 +0100

Third commit

So again we have a new tree, let’s see the contents:

> git cat-file -p 948e4085e60fb71230e7e4fae2a9edbf6c7af780

100644 blob 791b4c25c92cf5bd6c5a7a05343132f69ed3ff3d A

100644 blob a091a72e3cf3e81b8d0b0bc0c19abee7aab03fe0 B

100644 blob 791b4c25c92cf5bd6c5a7a05343132f69ed3ff3d C

A and C have the same content, so they share the same blob ID: there is no need to create another object to store C. Let’s see the content of the previous commit’s main tree:

> git cat-file -p e30ead0c3157e6e2a355e49cdb76ec67f6542d21

100644 blob 791b4c25c92cf5bd6c5a7a05343132f69ed3ff3d A

100644 blob a091a72e3cf3e81b8d0b0bc0c19abee7aab03fe0 B

Git doesn’t waste any space: it only creates new objects when it has to.
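You can even count the objects to confirm it. A small sketch, again in an assumed throwaway repository, so the hashes are not the ones shown above:

```shell
# Throwaway repository (assumed setup).
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
export GIT_AUTHOR_NAME=Gopher GIT_AUTHOR_EMAIL=gopher@example.com
export GIT_COMMITTER_NAME=Gopher GIT_COMMITTER_EMAIL=gopher@example.com
echo "A simple file" > A && git add A && git commit -q -m "First commit"

# Loose objects so far: the blob for A, one tree and one commit.
before=$(git count-objects | cut -d' ' -f1)

cp A C && git add C && git commit -q -m "Copy A to C"
after=$(git count-objects | cut -d' ' -f1)

# Both names resolve to the same blob.
a=$(git rev-parse HEAD:A); c=$(git rev-parse HEAD:C)
echo "A -> $a"
echo "C -> $c"
echo "new objects: $((after - before))"   # a new tree and a new commit, no new blob
```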

Hope you liked it,

Stupid Gopher