Computing the network metrics

Prerequisites

A Lightning node running LND
A working version of Python (I use Python 3.7.5)
The following Python libraries: pandas, NumPy, graph-tool, NetworkX
Optional: Jupyter Lab

Note: practically everything can be done with either NetworkX or graph-tool alone. However, I have found graph-tool to be much faster for some calculations, as its core is written in C++ with a Python wrapper. On the other hand, NetworkX feels more Pythonic and you might find it easier to use.

Note 2: graph-tool can’t be installed using pip, so you will have to install the dependencies by hand or with a package manager and then compile it yourself. It can be a tedious process and the compilation takes a long time, so go make yourself a nice cup of coffee.

Note 3: for simplicity’s sake, we assume the graph is undirected. However, both these libraries can deal with directed graphs too. It just needs a little bit more processing when converting the JSON graph to another format. If you would like me to make another post going over that process let me know!

Creating our project environment

Let’s first create a project directory where we will save the graph data:

$ mkdir ln-graph-stats

$ cd ln-graph-stats

Then, we create a virtual environment for it (optional, but recommended).

$ python3 -m venv ln-venv

Tip: If you are going to use graph-tool, I recommend you install it in your system packages instead of in your virtual environment. That will save you a lot of hassle. In that case, run the following command instead (after having compiled and installed graph-tool):

$ python3 -m venv ln-venv --system-site-packages

Finally, activate your virtual environment and we are good to go!

$ source ln-venv/bin/activate

Getting the graph data

To get the graph data, we first need to ensure we have an instance of LND up and running. Once it is running, open a terminal tab and run the following command:

$ lncli describegraph > /full/path/to/ln-graph-stats/lngraph.json

Reading the graph data in Python

The graph data will be in JSON format. This is great for storing the data, but it is not great for processing it. Therefore, we will parse it into a format that we can work with. I will show you how to parse the JSON into a pandas DataFrame, a NetworkX graph and a graph-tool graph. Each has its own advantages, balancing speed and convenience, and I will let you decide which you prefer to work with. I will always specify which one I am using for each statistic.
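Before converting anything, it helps to load the dump once and check that it looks the way we expect. The snippet below is a minimal sketch: `load_graph_json` is a helper name made up for this example, and it assumes the file holds a JSON object with top-level "nodes" and "edges" lists, which is what `lncli describegraph` produced for me.

```python
import json
from pathlib import Path


def load_graph_json(path):
    """Load the describegraph dump and run a basic sanity check."""
    with Path(path).open(encoding="utf-8") as f:
        graph = json.load(f)
    # Assumption: the dump has top-level "nodes" and "edges" lists.
    for key in ("nodes", "edges"):
        if not isinstance(graph.get(key), list):
            raise ValueError(f"expected a list under {key!r}")
    return graph
```

For example, `graph = load_graph_json("lngraph.json")` followed by `len(graph["nodes"])` and `len(graph["edges"])` gives a quick idea of the size of the network as seen by your node.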

Pandas DataFrame
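A minimal sketch of the DataFrame conversion. It assumes `graph` is the parsed JSON dict, with a "nodes" list and an "edges" list whose entries carry `node1_pub`, `node2_pub`, `capacity` and the nested `node1_policy` / `node2_policy` objects. `graph_to_dataframes` is a hypothetical name used only for this illustration.

```python
import pandas as pd


def graph_to_dataframes(graph):
    """Return (nodes_df, edges_df) built from the parsed describegraph dict."""
    nodes_df = pd.DataFrame(graph["nodes"])
    # json_normalize flattens the nested policies into columns such as
    # "node1_policy.disabled" and "node2_policy.fee_base_msat".
    edges_df = pd.json_normalize(graph["edges"])
    # Assumption: capacity is serialised as a string of satoshis,
    # so we convert it to an integer column.
    edges_df["capacity"] = edges_df["capacity"].astype("int64")
    return nodes_df, edges_df
```

Keeping nodes and channels in two separate frames mirrors the JSON layout and makes per-node aggregations straightforward later on.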

NetworkX Graph
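A minimal sketch of the NetworkX conversion, under the same assumptions about the JSON layout ("nodes" with `pub_key`, "edges" with `node1_pub`, `node2_pub` and `capacity`). The function name `graph_to_networkx` is made up for this example. One design choice to be aware of: `nx.Graph` keeps a single edge per pair of nodes, so this sketch merges parallel channels by summing their capacity. If you want to keep every channel as its own edge, `nx.MultiGraph` is the alternative.

```python
import networkx as nx


def graph_to_networkx(graph):
    """Build an undirected NetworkX graph from the parsed describegraph dict."""
    g = nx.Graph()
    for node in graph["nodes"]:
        g.add_node(node["pub_key"], alias=node.get("alias", ""))
    for edge in graph["edges"]:
        u, v = edge["node1_pub"], edge["node2_pub"]
        capacity = int(edge["capacity"])
        if g.has_edge(u, v):
            # Parallel channel between the same two nodes: merge it.
            g[u][v]["capacity"] += capacity
            g[u][v]["num_channels"] += 1
        else:
            g.add_edge(u, v, capacity=capacity, num_channels=1)
    return g
```

After this, `g.number_of_nodes()` and `g.number_of_edges()` give the node count and the number of connected node pairs.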

Graph-Tool Graph
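A minimal sketch of the graph-tool conversion, again assuming the JSON layout described above. `graph_to_graph_tool` is a hypothetical name. graph-tool identifies vertices by integer index, so the sketch keeps a dict from public key to vertex and stores the key and the capacity in property maps. Unlike the NetworkX version, parallel channels stay as separate edges here, since graph-tool allows them. The import sits inside the function only so that the file can still be loaded on a machine where graph-tool is not installed.

```python
def graph_to_graph_tool(graph):
    """Build an undirected graph-tool graph from the parsed describegraph dict."""
    # Imported lazily: graph-tool is often only available system-wide.
    from graph_tool import Graph

    g = Graph(directed=False)
    pub_key = g.new_vertex_property("string")
    capacity = g.new_edge_property("int64_t")
    index = {}  # pub_key -> vertex descriptor

    def vertex_for(key):
        # Create the vertex on first sight, also for keys that only
        # appear in the edge list.
        if key not in index:
            v = g.add_vertex()
            pub_key[v] = key
            index[key] = v
        return index[key]

    for node in graph["nodes"]:
        vertex_for(node["pub_key"])
    for edge in graph["edges"]:
        e = g.add_edge(vertex_for(edge["node1_pub"]), vertex_for(edge["node2_pub"]))
        capacity[e] = int(edge["capacity"])

    # Attach the property maps so they travel with the graph object.
    g.vertex_properties["pub_key"] = pub_key
    g.edge_properties["capacity"] = capacity
    return g
```

The property maps can then be read back with `g.vp["pub_key"]` and `g.ep["capacity"]`.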

And now, the metrics!

Okay, great! We’ve got the option to save our graph in any of three different formats. Each format has its own advantages, and which one you use depends heavily on the type of analysis you wish to do. For the sake of this practical guide, I will showcase all three of them for different metric calculations. However, as I mentioned above, you are free to pick between graph-tool and NetworkX depending on whether you prefer speed or ease of use. Keep in mind that graph-tool might not have all the functions NetworkX offers.

Average and quantiles

If you check out the BitcoinVisuals website, you will notice most of the statistics show the average, as well as the following quantile values: 0.9, 0.5 and 0.1. The average is sometimes misleading, and the quantile values can give a better idea of the actual distribution of values across the network. Since we will be reproducing the average and quantiles for most statistics, a small helper function saves some time and lines of code.
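One way such a helper could look is sketched below. The name `mean_and_quantiles` and the shape of the returned dict are choices made for this example; the quantile levels default to the 0.1, 0.5 and 0.9 values mentioned above, and pandas' default linear interpolation is used.

```python
import pandas as pd


def mean_and_quantiles(values, quantiles=(0.1, 0.5, 0.9)):
    """Return the mean and the requested quantiles of a sequence of numbers."""
    series = pd.Series(values, dtype="float64")
    stats = {"mean": series.mean()}
    for q in quantiles:
        # Keys look like "q0.1", "q0.5", "q0.9".
        stats[f"q{q}"] = series.quantile(q)
    return stats
```

It accepts anything pandas can turn into a Series, for example a DataFrame column such as a per-node channel count.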

Nodes (Pandas DataFrame)

I decided to group some of the metrics together in a way that made the most sense practically and code-wise. So, in this bit we will have a look at all the metrics related to the network nodes that don’t require any fancy graph library, just some good old pandas DataFrames. We will compute the total number of nodes both with and without channels, the number of channels per node and the total capacity per node. Also, since we will be looking at the channel policies anyway, we will compute the total number and percentage of enabled channels per node.

The following function adds these columns to our graph DataFrame: ‘num_enabled_channels’, ‘num_channels’, ‘percent_enabled_chan’ and ‘total_node_capacity’, which is all we need to compute the statistics mentioned above.
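A sketch of how these four columns could be computed is shown below. `add_node_channel_stats` is a hypothetical name, `nodes_df` is assumed to have a `pub_key` column, and `edges` is assumed to be the raw "edges" list from the JSON. The definition of "enabled" is also an assumption of this sketch: a channel counts as enabled for a node when that node's policy is present and not flagged `disabled`.

```python
import pandas as pd


def add_node_channel_stats(nodes_df, edges):
    """Return a copy of nodes_df with per-node channel statistics added."""
    rows = []
    for edge in edges:
        for side in ("node1", "node2"):
            policy = edge.get(f"{side}_policy") or {}
            # Assumption: enabled = policy known and not marked disabled.
            enabled = bool(policy) and not policy.get("disabled", False)
            rows.append({
                "pub_key": edge[f"{side}_pub"],
                "capacity": int(edge["capacity"]),
                "enabled": int(enabled),
            })
    per_side = pd.DataFrame(rows, columns=["pub_key", "capacity", "enabled"])
    stats = per_side.groupby("pub_key").agg(
        num_channels=("capacity", "size"),
        num_enabled_channels=("enabled", "sum"),
        total_node_capacity=("capacity", "sum"),
    ).reset_index()
    out = nodes_df.merge(stats, on="pub_key", how="left")
    # Nodes without any channel get zeros instead of NaN.
    for col in ("num_channels", "num_enabled_channels", "total_node_capacity"):
        out[col] = out[col].fillna(0).astype("int64")
    ratio = 100 * out["num_enabled_channels"] / out["num_channels"]
    out["percent_enabled_chan"] = ratio.where(out["num_channels"] > 0, 0.0)
    return out
```

With the result in hand, `(out["num_channels"] > 0).sum()` counts the nodes that have at least one channel, and `len(out)` counts all nodes, with or without channels.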