February 01, 2013

Libgit2 aims to make it easy to do interesting things with git. What’s the first thing you always do when learning git? That’s right, you clone something from GitHub. Let’s get started, shall we? Let’s get some of the boilerplate out of the way:

# include "git2.h" # include <stdio.h> int main ( int argc , char * * argv ) { const char * url , * path ; if ( argc < 3 ) { printf ( "USAGE: clone <url> <path>

" ) ; return - 1 ; } url = argv [ 1 ] ; path = argv [ 2 ] ; return do_clone ( url , path ) ; }

What does the do_clone method look like? Let’s start simple:

static int do_clone ( const char * url , const char * path ) { git_repository * repo = NULL ; int ret = git_clone ( & repo , url , path , NULL ) ; git_repository_free ( repo ) ; return ret ; }

git_clone takes some information, and fills in a pointer for us with a git_repository object we can use to do all manner of unholy things. For now, let’s ignore the repository itself, except to be good citizens and release the memory associated with it.

That NULL parameter? That’s for a git_clone_options structure, which defaults to some sensible stuff. The way our code is written right now, these two commands will have the same results:

$ ./clone http://github.com/libgit2/libgit2 ./libgit2 $ git clone http://github.com/libgit2/libgit2

… except that git tells you what it’s doing. Let’s fix that.

One of the things you can do with git_clone_options is have libgit2 call a function when there is progress to report. A typical callback looks like this:

static void fetch_progress ( const git_transfer_progress * stats , void * payload ) { int fetch_percent = ( 100 * stats -> received_objects ) / stats -> total_objects ; int index_percent = ( 100 * stats -> indexed_objects ) / stats -> total_objects ; int kbytes = stats -> received_bytes / 1024 ; printf ( "network %3d%% (%4d kb, %5d/%5d) /" " index %3d%% (%5d/%5d)

" , fetch_percent , kbytes , stats -> received_objects , stats -> total_objects , index_percent , stats -> indexed_objects , stats -> total_objects ) ; }

That stats object gives you lots of useful stuff:

The number of objects transferred over the network

The number of objects that the indexer has processed

The total number of objects expected

The number of bytes transferred

So let’s rewrite our do_clone function to plug that in:

static int do_clone ( const char * url , const char * path ) { git_repository * repo = NULL ; git_clone_options opts = GIT_CLONE_OPTIONS_INIT ; int ret ; opts . fetch_progress_cb = fetch_progress ; ret = git_clone ( & repo , url , path , & opts ) ; git_repository_free ( repo ) ; return ret ; }

If you run this now, the program will tell you what it’s doing! You can watch the network transfer happening, and notice that the indexer is doing its job at the same time.

[...] network 73% ( 7 kb, 51/ 69) / index 71% ( 49/ 69) network 75% ( 7 kb, 52/ 69) / index 72% ( 50/ 69) network 76% ( 7 kb, 53/ 69) / index 73% ( 51/ 69) network 78% ( 7 kb, 54/ 69) / index 75% ( 52/ 69) [...]

If you try this with a large repository, you’ll notice a significant pause at the end. All the data has been moved, what’s going on? It turns out that doing a checkout can take a non-trivial amount of time. It also turns out that libgit2 will let you report that progress as well!