"Git good" - anonymous



Since everyone these days is becoming a developer, there is a high chance that you are one of them. Even if you are not, you probably still use a version control system in your day-to-day work (if not... what's wrong with you!?), the most popular option being Git. You know the basics (e.g. git add, git commit, git push, git pull) and even the more advanced stuff (e.g. git reset, git stash, git checkout), but today I will go into four lesser-known commands that can actually be very useful. Why four? Because everyone stops at three (Sherlock reference right there). By the way, in case you are from Mars, the above image is from xkcd.

Git bisect

If you are in a well-organized development environment, this command should be almost useless to you. But it could happen that you are just working on a small project at home, or your company is still figuring out its continuous integration setup, or even that your dog ate your tests, and the following situation arises.



You are happily developing cool new features for your application when news starts coming in about something not working properly in the released version. You try to see what's wrong in the code, but cannot figure it out, so you decide to find the commit which introduced the bug. How do you do that? Well, this is where your old friend binary search comes to help, and git also gives you a hand with the git bisect command.

The idea is simple. Using git log, you pick a commit from history where you know for sure that your app was free of this particular bug, and a later commit where the bug is present (usually the latest one). Then you use git bisect to do a binary search between the two for the commit that introduced the bug. If this sounds too abstract, let's look at an example.

Let's say we have passionate people who want to save awesome songs in git so they last for posterity. Being well educated, they commit often, one verse at a time. The latest project the team has worked on was Candy Mountain. They kept adding verses and in the end everything looked good. This is the latest version present in the git repo:

[bdl@composer]$ cat amazing_lyrics
Oh! When you're down and looking for some cheering up
Then just head right on up to the candy mountain cave
When you get inside you'll find yourself a cheery land
Such a happy and joyful and perky, merry land
They've got lollipops and gummy drops and candy things
Oh so many things that will brighten up your day
It's impossible to wear a frown in candy town
It's the mecca of love, the candy cave
They've got jelly beans and coconuts with little hats
Candy rabbits, chocolate bunnies, it's a wonderland of sweets
Ride the candy train to town and hear the candy band
Candy bells it's a treat as they march across the land
Cherry ribbons stream across the sky into the ground
Turn around
It astounds!
It's a dancing candy tree
In the candy cave imagination runs so free
So now Charlie, please will you go into the cave

Then you think that you should evolve your setup with some tests. You develop the amazing check_song.py software (contact me if you want a copy), which checks if the lyrics match with the original, even if they are incomplete. You run it for the first time against the song version in the repo and you see the following:

[bdl@composer]$ python check_song.py amazing_lyrics
WOW!! LYRICS DO NOT MATCH WITH THE ORIGINAL!! YOU ARE RUINING A GREAT SONG.

Disaster!!! There is an error in the lyrics, even though they seem OK when you look at them. Going through the commits one by one, trying to figure out where the error was introduced, is going to take a long time. Luckily we heard about git bisect recently from an awesome article.
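The real check_song.py is not shown in this post, so here is only a rough, hypothetical stand-in for the kind of check it is described as doing: accept the lyrics if they equal the original or are an unfinished prefix of it. The function names and the tiny two-verse "original" are made up for illustration.

```python
def load_lines(text):
    """Split lyrics into lines, dropping trailing whitespace and trailing blank lines."""
    lines = [line.rstrip() for line in text.splitlines()]
    while lines and not lines[-1]:
        lines.pop()
    return lines


def song_is_intact(candidate_text, original_text):
    """Return True if the candidate equals the original or is an unfinished prefix of it."""
    candidate = load_lines(candidate_text)
    original = load_lines(original_text)
    if len(candidate) > len(original):
        return False  # more verses than the original has
    return candidate == original[:len(candidate)]


# Hypothetical two-verse "original", used only for this demonstration.
ORIGINAL = "first verse\nsecond verse\n"

print(song_is_intact("first verse\n", ORIGINAL))               # incomplete but correct
print(song_is_intact("first verse\nwrong verse\n", ORIGINAL))  # altered verse
```

The important property for what follows is that an incomplete song still passes, so the same check can be run on any older commit.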

We check out the first commit and verify that it is correct (in general, select a commit that you know for sure does not have the bug you are looking for). For convenience we tagged our first commit as first_verse and our latest commit as bad_verse:

[dragos@composer]$ git checkout first_verse
[dragos@composer]$ python check_song.py amazing_lyrics
GOOD JOB! THE SONG IS INTACT

Now we know our guilty commit is somewhere between these two commits and we are ready to start the search:

[dragos@composer]$ git bisect start
[dragos@composer]$ git bisect good first_verse
[dragos@composer]$ git bisect bad bad_verse
Bisecting: 8 revisions left to test after this (roughly 3 steps)
[9a2b4ea392d81192b3f0971dc9e0388b0827f10d] Adding amazing ninth verse

You see what happened? After giving the two commits between which we want to search (here we used two tags which point to those specific commits), git automatically checked out the commit in the middle of the range. Now we have to check if the buggy verse is still here and inform git about it.

[dragos@composer]$ python check_song.py amazing_lyrics
GOOD JOB! THE SONG IS INTACT
[dragos@composer]$ git bisect good
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[eada803203224f16fb0760255575473ff3353cbc] Adding amazing thirteenth verse

With this info, git checked out the commit halfway between this good commit and the last commit. We just got rid of half of the commits that we had to search through. We repeat the process until we find the commit that introduced the error.

[dragos@composer]$ python check_song.py amazing_lyrics
WOW!! LYRICS DO NOT MATCH WITH THE ORIGINAL!! YOU ARE RUINING A GREAT SONG.
[dragos@composer]$ git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[4b9da3dddf2dcaeb11f3e64746e67cf40882e507] Adding amazing eleventh verse
[dragos@composer]$ python check_song.py amazing_lyrics
WOW!! LYRICS DO NOT MATCH!! YOU ARE RUINING THIS GREAT SONG.
[dragos@composer]$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[634f0ad38519901d7db8885ed67a9c4122f7d756] Adding amazing tenth verse
[dragos@composer]$ python check_song.py amazing_lyrics
GOOD JOB! THE SONG IS INTACT
[dragos@composer]$ git bisect good
4b9da3dddf2dcaeb11f3e64746e67cf40882e507 is the first bad commit
commit 4b9da3dddf2dcaeb11f3e64746e67cf40882e507
Author: bad_person <bad.person@mail.com>
Date: Fri Jan 13 00:23:04 2017 +0100
    Adding amazing eleventh verse
:100644 100644 6b805f5880efc6b3aca886c33b892a3c54c69bcd ab327e224638a8bb744a78d9c1985c80bdb8e852 M amazing_lyrics

In the end we found the guilty commit, so now we can see what the error was:

[dragos@composer]$ git bisect reset 4b9da
Previous HEAD position was 634f0ad... Adding amazing tenth verse
HEAD is now at 4b9da3d... Adding amazing eleventh verse
[dragos@composer]$ git diff HEAD^ HEAD
diff --git a/amazing_lyrics b/amazing_lyrics
index 6b805f5..ab327e2 100644
--- a/amazing_lyrics
+++ b/amazing_lyrics
@@ -7,4 +7,5 @@ Oh so many things that will brighten up your day
 It's impossible to wear a frown in candy town
 It's the mecca of love, the candy cave
 They've got jelly beans and coconuts with little hats
-Candy rats, chocolate bats, it's a wonderland of sweets
+Candy rabbits, chocolate bunnies, it's a wonderland of sweets
+Ride the candy train to town and hear the candy band

Aha! So someone did not like candy rats and chocolate bats. Now we know. We are free to fix this small error in a new commit.

In conclusion, this is useful when you know the behavior that changed, but not what code caused it.
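For the curious, the search we just did by hand can be sketched in a few lines of Python. This is only a simplified model, assuming a straight line of 18 commits (one per verse) and a made-up is_bad check that stands in for running check_song.py; real git also has to cope with branches and merges and may pick its midpoints differently.

```python
def find_first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    commits is ordered oldest to newest; commits[0] is known to be good
    and commits[-1] is known to be bad.
    """
    good, bad = 0, len(commits) - 1
    tested = []  # commits we had to check, in order
    while bad - good > 1:
        middle = (good + bad) // 2
        tested.append(commits[middle])
        if is_bad(commits[middle]):
            bad = middle   # the culprit is at or before the middle
        else:
            good = middle  # the culprit is after the middle
    return commits[bad], tested


# Assumed history: verse 1 is the oldest commit, verse 18 the newest.
commits = [f"verse {n}" for n in range(1, 19)]


def is_bad(commit):
    """Hypothetical stand-in for the checker: the bug arrives with verse 11."""
    return int(commit.split()[1]) >= 11


first_bad, tested = find_first_bad(commits, is_bad)
print(first_bad)  # verse 11
print(tested)     # ['verse 9', 'verse 13', 'verse 11', 'verse 10']
```

With these assumptions the sketch visits the same verses as our session above: four checks instead of up to seventeen. Git can also drive this loop on its own with git bisect run followed by a test command, treating exit code 0 as good and most non-zero exit codes as bad.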

Git blame

This is maybe the best-known command on the list, but since I have met a lot of people who are not aware of it, I decided to include it.

Git blame annotates your files with information about each line. You are shown the hash of the latest commit that modified the line, the author and the date.

As an example we can take the file from our previous section:

[dragos@composer]$ git blame amazing_lyrics
^d5c5a92 (bdl 2017-01-13 00:06:29 +0100 1) Oh! When you're down and looking for some cheering up
fa76d21a (bdl 2017-01-13 00:07:17 +0100 2) Then just head right on up to the candy mountain cave
bab5fc16 (bdl 2017-01-13 00:07:58 +0100 3) When you get inside you'll find yourself a cheery land
b96e74ec (bdl 2017-01-13 00:08:50 +0100 4) Such a happy and joyful and perky, merry land
aa5c8716 (bdl 2017-01-13 00:12:05 +0100 5) They've got lollipops and gummy drops and candy things
2b26aaf7 (bdl 2017-01-13 00:14:48 +0100 6) Oh so many things that will brighten up your day
7e06c9bb (best_guy 2017-01-13 00:15:45 +0100 7) It's impossible to wear a frown in candy town
104cfd46 (best_guy 2017-01-13 00:16:11 +0100 8) It's the mecca of love, the candy cave
5e259be9 (bdl 2017-01-13 00:16:53 +0100 9) They've got jelly beans and coconuts with little hats
b743274c (bad_person 2017-01-13 00:23:04 +0100 10) Candy rabbits, chocolate bunnies, it's a wonderland of sweets
b743274c (bad_person 2017-01-13 00:23:04 +0100 11) Ride the candy train to town and hear the candy band
896d4c44 (bdl 2017-01-13 00:25:32 +0100 12) Candy bells it's a treat as they march across the land
6e788500 (bdl 2017-01-13 00:27:23 +0100 13) Cherry ribbons stream across the sky into the ground
33bbe8ef (bdl 2017-01-13 00:28:02 +0100 14) Turn around
1efad606 (bdl 2017-01-13 00:29:47 +0100 15) It astounds!
9352d03d (bdl 2017-01-13 00:30:45 +0100 16) It's a dancing candy tree
140da6b7 (random_guy 2017-01-13 00:31:17 +0100 17) In the candy cave imagination runs so free
eb97f1c2 (bdl 2017-01-13 00:31:52 +0100 18) So now Charlie, please will you go into the cave

Looking at this, we could notice that there are two lines modified by the same commit, which doesn't conform to our "one verse, one commit" policy. This could make us curious enough to look at that commit and find out that it is where our error was introduced.
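If you do not feel like eyeballing the output, spotting such commits can be automated. Below is a small sketch that parses blame lines in the default format shown above and reports commits that last touched more than one line. The regular expression is my own assumption about that format (it ignores variants such as blame output that includes file names), so treat it as an illustration rather than a robust parser; for scripts, git blame --porcelain is the safer input.

```python
import re
from collections import defaultdict

# Matches lines such as:
# b743274c (bad_person 2017-01-13 00:23:04 +0100 10) Candy rabbits, ...
BLAME_LINE = re.compile(
    r"^(?P<commit>\^?[0-9a-f]+)\s+"
    r"\((?P<author>.+?)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tz>[+-]\d{4})\s+"
    r"(?P<lineno>\d+)\) ?(?P<text>.*)$"
)


def commits_with_several_lines(blame_output):
    """Map each commit that last touched more than one line to those line numbers."""
    lines_by_commit = defaultdict(list)
    for raw in blame_output.splitlines():
        match = BLAME_LINE.match(raw)
        if match:  # silently skip anything that does not look like a blame line
            lines_by_commit[match["commit"]].append(int(match["lineno"]))
    return {c: nums for c, nums in lines_by_commit.items() if len(nums) > 1}


# A few lines copied from the output above.
sample = """\
^d5c5a92 (bdl 2017-01-13 00:06:29 +0100 1) Oh! When you're down and looking for some cheering up
5e259be9 (bdl 2017-01-13 00:16:53 +0100 9) They've got jelly beans and coconuts with little hats
b743274c (bad_person 2017-01-13 00:23:04 +0100 10) Candy rabbits, chocolate bunnies, it's a wonderland of sweets
b743274c (bad_person 2017-01-13 00:23:04 +0100 11) Ride the candy train to town and hear the candy band
"""

print(commits_with_several_lines(sample))  # {'b743274c': [10, 11]}
```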

Another use is if you just don't understand what a piece of code does, you can look up who committed it and go ask.

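Fun fact #1: If a commit hash has the ^ sign in front of it, it is a boundary commit. When you blame the whole history, as we did here, that is the very first commit of the repository, so the associated line has not been modified since then.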

Fun fact #2: Git can even track line movements across files. So if you decide to refactor a big code or configuration file into multiple smaller ones and you git blame the smaller files, git can show you the original commit in the big file and the name of the big file. All this by just adding the -C option.

Git reflog

You were probably told to avoid using git reset --hard because it is a destructive operation. That is because it modifies not only HEAD (as --soft does) and the index (as --mixed does), but also your working directory. If you had uncommitted work, then you are screwed.

What you may not know is that there is (mostly) nothing to worry about if you have everything committed and, for whatever reason, you do a hard reset. That is because git reflog has your back. Of course this is just an example; reflog is useful any time you have lost commits and want to recover them.

Reflog is a local structure that records where your HEAD and branch references have pointed in the past. It is not shared with anyone else; everyone has their own reflog. It is important to mention that it does not record information forever: after a configurable amount of time, entries are removed from the reflog (by default after 90 days, or 30 days for entries that are no longer reachable from the current branch tip; see the gc.reflogExpire and gc.reflogExpireUnreachable settings).

Example time!!

Let's look at our current commit history.

[dragos@composer]$ git log --pretty=oneline
8ef8a6b18d4c76497a0cd5b0104336f8253d33a4 first commit of checking software
eb97f1c20d82226f4d04e96237bf6956fa09e345 Adding amazing eighteenth verse
140da6b784e33367e200349e6da5c0ddc48e7ef8 Adding amazing seventeenth verse
9352d03d455b4c646f3de91abd2aacd2998f34b8 Adding amazing sixteenth verse
...

Just as a reminder: git reset moves the pointer of the current branch (and, implicitly, HEAD) to a new commit. With --mixed (the default if no flag is specified) it also updates the index, and with --hard it updates both the index and the working directory (if you are still troubled by it, an excellent explanation can be found here). For some reason we are not satisfied with the latest commit, so we will reset to the previous one (you can reset to any commit by specifying some sort of reference to it):

[dragos@composer]$ git reset HEAD^
[dragos@composer]$ git log --pretty=oneline
eb97f1c20d82226f4d04e96237bf6956fa09e345 Adding amazing eighteenth verse
140da6b784e33367e200349e6da5c0ddc48e7ef8 Adding amazing seventeenth verse
9352d03d455b4c646f3de91abd2aacd2998f34b8 Adding amazing sixteenth verse
1efad606e509a3eca835678ebf2e9c2fffaae45c Adding amazing fifteenth verse
...

We notice that the latest commit from our previous git log output is not present anymore. But it was not deleted; it is just no longer visible in the normal output of git log, which by default only walks the commit ancestry chain.

We realize that the commit we reset was actually perfect and we want to restore it. But how? Well, we can just use git reset and make our branch point to that commit again, but how do we get a reference to that commit, since it does not appear in git log anymore? This is where git reflog steps in.

[dragos@composer]$ git reflog
eb97f1c HEAD@{0}: reset: moving to HEAD^
8ef8a6b HEAD@{1}: commit: first commit of checking software
eb97f1c HEAD@{2}: checkout: moving from d5c5a925da0dc687e8a7d5d40111bbd930fd5a02 to master
d5c5a92 HEAD@{3}: checkout: moving from master to first
...

The list goes on, but for our case we are only interested in the top two entries. From the output we can see which commit HEAD was pointing to previously and where it is pointing now (indicated by HEAD@{0}). So to move our branch back to how it was before, we can use either the desired commit hash directly (i.e. 8ef8a6b) or a reference to it (i.e. HEAD@{1}). Be careful: every time HEAD moves, a new entry is added on top, so HEAD@{1} will most probably point to some other commit afterwards. So let's reset the reset:

[dragos@composer]$ git reset HEAD@{1}
[dragos@composer]$ git log --pretty=oneline
8ef8a6b18d4c76497a0cd5b0104336f8253d33a4 first commit of checking software
eb97f1c20d82226f4d04e96237bf6956fa09e345 Adding amazing eighteenth verse
140da6b784e33367e200349e6da5c0ddc48e7ef8 Adding amazing seventeenth verse
9352d03d455b4c646f3de91abd2aacd2998f34b8 Adding amazing sixteenth verse
...

As we can see, things are back to normal. So the next time you panic because of a git reset gone wrong, don't forget you have a guardian angel out there under the name of reflog.
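To make the HEAD@{n} bookkeeping less magical, here is a toy model of it in Python. It is in no way how git stores things internally; the ToyRepo class and its methods are invented for this post. It only mimics the one rule that matters here: every time HEAD moves, a new entry goes on top of the list.

```python
class ToyRepo:
    """Toy model of HEAD plus its reflog (newest entry first)."""

    def __init__(self, first_commit):
        self.head = first_commit
        self.reflog = [(first_commit, "commit (initial)")]

    def _move_head(self, target, reason):
        self.head = target
        self.reflog.insert(0, (target, reason))  # becomes HEAD@{0}

    def commit(self, new_commit):
        self._move_head(new_commit, "commit")

    def reset(self, target):
        self._move_head(target, "reset")

    def at(self, n):
        """Commit that HEAD@{n} refers to."""
        return self.reflog[n][0]


repo = ToyRepo("eb97f1c")   # eighteenth verse
repo.commit("8ef8a6b")      # first commit of checking software
repo.reset("eb97f1c")       # the reset we regret

print(repo.at(0), repo.at(1))  # eb97f1c 8ef8a6b

repo.reset(repo.at(1))      # reset the reset
print(repo.head)            # 8ef8a6b
print(repo.at(1))           # eb97f1c
```

Note how HEAD@{1} meant 8ef8a6b before the second reset and eb97f1c after it. That is why copying the hash is the safer choice when you are about to move HEAD around several times.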

Fun fact #3: I said before that by default git log will only show you the commits in the ancestry chain. You can make it walk the reflog chain instead, by specifying the -g/--walk-reflogs flag.

Commit ranges

You have reached the final section. Congratulations on going the extra mile.

Here we will see how to specify ranges of commits using three methods, each useful for its own purpose. They are especially handy when managing multiple branches. Before we dive into them, below is the picture of the commit tree we will be working on.

Double dot

I guess you have used this one, even if you did not fully know what it actually does. Your typical usage was probably just to filter the commits that you see from the same branch, something like the following (I removed the hashes from the output, showing only the corresponding tags):

[dragos@composer]$ git log --pretty=oneline B..F
F
D

So this shows the commits between the range specified without including the left-most object (i.e. B in this case). But what would happen if we try this with commits that are each on a different branch? Let's try some more examples:

[dragos@composer]$ git log --pretty=oneline G..F
F
D
[dragos@composer]$ git log --pretty=oneline F..G
G
E
[dragos@composer]$ git log --pretty=oneline C..G
G
E
[dragos@composer]$ git log --pretty=oneline G..C
C

If you have not deduced exactly what this does from the previous outputs, then SPOILER ALERT, because I'm going to tell you. Using this notation you see all commits that are reachable from the second specified commit (going only backwards, of course) but not reachable from the first one. Please go back to the picture and the previous examples to see that this makes sense.
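If you think in sets, the double dot is just a set difference. The sketch below shows this on a guessed commit graph: the PARENTS table is my assumption, chosen only so that it agrees with the outputs shown above, and the real tree in the picture may look different. The sketch also ignores the order in which git log prints commits.

```python
# Assumed graph (child -> parents), merely consistent with the outputs above.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["B"],
    "F": ["D"],
    "G": ["E"],
    "H": ["E"],
}


def reachable(commit):
    """All commits you can get to from `commit` by walking backwards, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def double_dot(first, second):
    """Model of first..second: reachable from second but not from first."""
    return reachable(second) - reachable(first)


print(sorted(double_dot("B", "F")))  # ['D', 'F']
print(sorted(double_dot("F", "G")))  # ['E', 'G']
print(sorted(double_dot("G", "C")))  # ['C']
```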

One handy use of this is right before you push your changes to a remote.

[dragos@composer]$git log --pretty=oneline origin/master..HEAD

The output of the above command shows the commits that you would upload to master on origin when using git push (assuming, of course, that you recently did a git fetch or git pull so your remote references are up to date).

Multiple branches

So the double dot is all fine and cool, but what do we do when we want to specify more than two branches, such as seeing what commits are in any of several branches that aren’t in the branch you’re currently on? It turns out that the double dot notation is just a shorthand for another syntax. If you put the ^ character or --not before a reference, then git log will show you only commits that cannot be reached from that reference. To illustrate, the following commands are equivalent:

[dragos@composer]$ git log --pretty=oneline G..F
[dragos@composer]$ git log --pretty=oneline ^G F
[dragos@composer]$ git log --pretty=oneline F --not G

And an example with multiple branches:

[dragos@composer]$ git log --pretty=oneline G --not H F
G

Triple dot

The last method that we are covering for specifying commit ranges is the triple dot notation. You could say that it provides an exclusive disjunction of the sets of commits reachable from two specified references. In normal words, it specifies all the commits that are reachable from either of the two references but not from both of them.

Last example and you can go on with your life:

[dragos@composer]$ git log --pretty=oneline C...G
49a59058100707187ac6e8980e77e45e57785407 G
4f1615031ededc27944ed43f4fc410b69a637919 E
b73244c413956a356ed5076b5b87f087d978ba2f C
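Both the --not form and the triple dot boil down to plain set operations on "what can I reach from here". The sketch below models them on a guessed commit graph; as before, the PARENTS table is only my assumption, picked to agree with the outputs in this post, and the helper names select and triple_dot are invented. Output ordering is not modelled.

```python
# Assumed graph (child -> parents), merely consistent with the outputs in this post.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["B"],
    "F": ["D"],
    "G": ["E"],
    "H": ["E"],
}


def reachable(commit):
    """All commits you can get to from `commit` by walking backwards, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def select(include, exclude=()):
    """Model of `git log <include...> --not <exclude...>`."""
    wanted = set().union(*(reachable(c) for c in include))
    unwanted = set().union(*(reachable(c) for c in exclude))
    return wanted - unwanted


def triple_dot(first, second):
    """Model of first...second: reachable from exactly one of the two."""
    return reachable(first) ^ reachable(second)


print(sorted(select(["F"], ["G"])))       # same as G..F: ['D', 'F']
print(sorted(select(["G"], ["H", "F"])))  # ['G']
print(sorted(triple_dot("C", "G")))       # ['C', 'E', 'G']
```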

Goodbye

That's it guys, I hope you learnt at least one thing from this and also had some fun. If you want to be alerted when there is a new post, press the big subscribe button underneath the comments. Don't worry, I don't have time to write too much, so I will not spam you.

