How to traverse a git repository using libgit2 and C++

This is a very simple C++ tutorial which explains how to use the open-source library libgit2 to iterate through a git repository.

Libgit2

This tutorial is an introduction to libgit2, which is “a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API”.

Although libgit2 is written in C, it works with C++ out of the box, and many language bindings are available.

The program

The program I am going to describe in this tutorial replicates a very basic version of the git command

git log --pretty=oneline

which lists commit objects in reverse chronological order showing their hash and short commit summary.

The original git-log offers tens of different options, but I want to keep things simple here.

How to traverse a git repository

Let’s have a look at the code.

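#include <git2.h>
#include <iostream>

Every program using libgit2 needs to include the git2.h header file. The iostream header is only needed here because the program prints its output with std::cout.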

int main(int argc, char * argv[])
{
    git_libgit2_init();

The first thing to do is to call git_libgit2_init to initialise libgit2 and its resources.

    const char * REPO_PATH = "/path/to/your/repo/";
    git_repository * repo = nullptr;
    git_repository_open(&repo, REPO_PATH);

You open a repository using git_repository_open, which creates an object you can use to interact with a repository in your code.

    git_revwalk * walker = nullptr;
    git_revwalk_new(&walker, repo);

The next step is to create a revision walker, the object which iterates through a git repository. This is done by the git_revwalk_new function.

    git_revwalk_sorting(walker, GIT_SORT_NONE);

Once we have a revision walker we need to set a few options to control the traversal.

The first one is the sorting mode used when iterating through the repository. It is set by calling git_revwalk_sorting and passing one of the following values as the second parameter:

GIT_SORT_NONE – the default mode. Older libgit2 documentation describes it as git's default reverse chronological order (most recent first), while newer documentation describes the order as arbitrary and implementation-specific, so request a mode explicitly if the order matters

GIT_SORT_TOPOLOGICAL – topological order: no parent is shown before all of its children

GIT_SORT_TIME – commit timestamp order

GIT_SORT_REVERSE – reverses the selected order; it can be combined with any of the modes above

Topological and time orders can also be combined using a bitwise OR.

I want to point out you don’t need to call this function to set GIT_SORT_NONE as that’s the default value. I only did it here to describe the function.

    git_revwalk_push_head(walker);

Now it’s time to set the root (starting commit) for the traversal. At least one commit must be pushed onto the walker before a walk can be started.

Calling git_revwalk_push_head sets the root to the repository’s HEAD.

    git_oid oid;

    while(!git_revwalk_next(&oid, walker))
    {

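After the setup stage we are ready to start the traversal. This is done by the function git_revwalk_next, which gets the ID of the next commit to visit. It returns 0 as long as a commit is available, so the loop ends at the first non-zero value, which means either that the walk is over or that an error occurred.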

        git_commit * commit = nullptr;
        git_commit_lookup(&commit, repo, &oid);

Once we have an ID we can retrieve a git_commit object by calling git_commit_lookup. We will use this object to retrieve information about the commit.

        std::cout << git_oid_tostr_s(&oid) << " " << git_commit_summary(commit) << std::endl;

This is where we get the info we need from the ID and the commit.

We can use git_oid_tostr_s to get the hash of the commit from the ID and git_commit_summary to get the short summary of the commit.

        git_commit_free(commit);
    }

Once we have used the commit it’s time to release the object by calling git_commit_free.

    git_revwalk_free(walker);
    git_repository_free(repo);

Then we can free the revision walker by calling git_revwalk_free and the repository by calling git_repository_free.

    git_libgit2_shutdown();

    return 0;
}

Finally we clean up libgit2 by calling git_libgit2_shutdown.

This is not really necessary in a program like this one, as exiting will free all the resources anyway. Nevertheless, it is needed in any software which keeps running after you have finished using libgit2.

Source code

The full source code of this program is available on GitHub and released under the Unlicense.

Program output

When running the program you will get something like this:

9f0cb4b09857571fcb698917c777dd0a8837c4c1 Commit in master to fix conflict.
03a590e6bcb7abdd56f34067d496a33b1bcbec7c Commit 2 in test2.
41aac50e36cb71c12626f38793bafaa91a2da78d Commit 1 in master.
55d1c54658eaf12a32772225567ebdcbe32ff653 Commit 0 in test2.
5951cf7240f4f87c5bfc4274eb00e392aaef250a Added one more line after merge.
...

which is basically the output of git log --pretty=oneline, as wanted.

How to handle errors

I decided to not handle errors in the code above to keep things simple and focused on the concepts I wanted to explain.

Obviously things are not so easy when you are writing real software as you always have to handle errors properly.

Most libgit2 functions return a negative value in case of error, so you usually want to write your code like:

if(git_libgit2_init() < 0)
{
    // handle error
}

// normal execution
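Repeating that check after every call gets verbose, so one option is to wrap it in a small helper. The sketch below is only an illustration: the name check and the choice of throwing std::runtime_error are assumptions for this example, not something libgit2 prescribes.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of libgit2): turns a negative return code
// into an exception so that call sites stay short.
inline void check(int return_code, const std::string & action)
{
    if (return_code < 0)
        throw std::runtime_error(action + " failed with code " + std::to_string(return_code));
}

// Usage:
//     check(git_libgit2_init(), "git_libgit2_init");
//     check(git_repository_open(&repo, REPO_PATH), "git_repository_open");
```

Note that git_revwalk_next also returns a negative value (GIT_ITEROVER) when the walk is simply finished, so its result should be inspected separately rather than passed to a helper like this one. For a human-readable description of the last failure, libgit2 provides git_error_last (named giterr_last in older releases).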

References

You can check out the official API reference of libgit2 to learn more about the functions described in this tutorial.

To learn how to build libgit2 and this example program check out the complete guide to linking available on the project’s website.

Conclusion

I am currently working on a project based on libgit2 and I have to admit that grasping concepts at first has not been very straightforward. It all makes sense now, but the reference and the examples provided can be a bit cryptic sometimes. When I started I wished there were more detailed tutorials and that’s why I decided to write this one.

If you are interested in seeing more examples or tutorials about libgit2, let me know by leaving a comment below.
