Hidalgo, Klinger, Barabási, and Hausmann (2007)[1] investigated economic development in a novel way. They analysed the exports of every country and built a graph relating the exported products. This technique offers new visual and analytical tools for studying economic development. When reading the paper, I wanted to replicate the analysis to get a deeper understanding of the 1990s in my country, Argentina. The economic policy of that decade is blamed for the infamous crisis of 2001. In this post I’ll explain how I replicated the paper, which was a straightforward task, and I’ll make some comments comparing Argentina in 1990 and in 2000. You can find the Python code for this article here.

The Model

This model uses data from international trade[2]. To create the objects of the model, we have to follow three steps:

For each country, compute its Revealed Comparative Advantage (RCA). This gives a value for each country-product pair. If, for a country c and a product i, the RCA is greater than 1, then country c is a significant exporter of product i: it has a revealed comparative advantage in that product.
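To make this step concrete, here is a minimal sketch of the RCA computation. It assumes the standard Balassa definition (the share of product i in country c's exports divided by the share of product i in world exports) and a hypothetical `exports` matrix with countries as rows and products as columns. The function name and the toy numbers are mine, for illustration only.

```python
import numpy as np

def revealed_comparative_advantage(exports):
    """Balassa RCA for a matrix where exports[c, i] is the value of
    product i exported by country c.

    Assumes every country and every product has non-zero total exports;
    otherwise the divisions below produce inf or nan.
    """
    exports = np.asarray(exports, dtype=float)
    country_totals = exports.sum(axis=1, keepdims=True)  # total exports per country
    product_totals = exports.sum(axis=0, keepdims=True)  # world exports per product
    world_total = exports.sum()                          # total world trade

    country_share = exports / country_totals   # weight of product i in country c's basket
    world_share = product_totals / world_total # weight of product i in world trade
    return country_share / world_share

# Toy data: two countries, two products (made-up values).
exports = np.array([[80.0, 20.0],
                    [20.0, 80.0]])
rca = revealed_comparative_advantage(exports)
print(rca)
```

In this toy case each country has an RCA above 1 only in the product that dominates its own export basket.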

Using the RCA computed for all countries and products, we compute a distance between products, phi(r,s). To compute it, we select all the countries exporting product s and compute the proportion of them that also export product r, that is, the countries that export products r and s in tandem. In probabilistic terms: given that a country exports product s (RCA(s) > 1), what is the probability that it also exports product r? These probabilities form a square matrix, phi, which relates every product to all the other products. The authors made this matrix symmetric by taking the minimum of the probability of exporting r conditioned on exporting s and the probability of exporting s conditioned on exporting r. Additionally, this operation avoids technical issues that arise when a country is the only exporter of a given product. Note that larger values of phi mean that two products are more closely related (the paper calls this quantity proximity).

matrix phi: distance between exported products
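The construction of phi can be sketched in a few lines of NumPy. This is an illustrative version, not necessarily the code used for this post: the function name `proximity`, the toy RCA values, and the choice to zero the diagonal are my own assumptions. It relies on the fact that the minimum of the two conditional probabilities equals the number of joint exporters divided by the larger of the two exporter counts.

```python
import numpy as np

def proximity(rca, threshold=1.0):
    """Symmetric matrix phi[r, s] = min(P(r | s), P(s | r)).

    rca[c, i] is the RCA of country c in product i; a country "exports"
    a product when its RCA exceeds the threshold.
    """
    exports_flag = (np.asarray(rca) > threshold).astype(float)
    joint = exports_flag.T @ exports_flag     # countries exporting both r and s
    ubiquity = exports_flag.sum(axis=0)       # countries exporting each product

    # min(joint/ubiquity[r], joint/ubiquity[s]) == joint / max(ubiquity[r], ubiquity[s])
    denom = np.maximum.outer(ubiquity, ubiquity)
    phi = np.divide(joint, denom, out=np.zeros_like(joint), where=denom > 0)

    np.fill_diagonal(phi, 0.0)  # assumption: drop self-links, useless for the graph
    return phi

# Toy data: four countries (rows), three products (columns), made-up values.
rca = np.array([[2.0, 1.5, 0.1],
                [1.2, 0.3, 0.2],
                [1.5, 1.1, 3.0],
                [0.4, 0.2, 1.4]])
phi = proximity(rca)
print(phi)
```

Here product 0 has three exporters and product 1 has two, both of which also export product 0. So P(0 | 1) is 1 while P(1 | 0) is 2/3, and the minimum, 2/3, is what ends up in the matrix.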

Finally, they needed a way to analyse this (sparse) matrix. We can view the matrix as a graph in which every product is a node and every cell is a weighted edge. We then need a subgraph of this graph that contains all the nodes. To build it, we start with a set containing a single node (any node). Then we repeatedly add a neighbor not yet present in the set, using the edge with the highest weight among those leaving the set. If we repeat this step until all the nodes are in the set, the result is a tree. In particular, this tree is the maximum spanning tree.
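The procedure described above is Prim's algorithm applied to the highest-weight edges. Below is a minimal sketch of it over a dense weight matrix. The function name and the toy matrix are hypothetical, and the sketch treats the graph as complete, so pairs of products with phi equal to 0 are still joined by zero-weight edges if nothing better is available.

```python
import numpy as np

def maximum_spanning_tree(phi):
    """Prim's algorithm on a symmetric weight matrix, keeping the heaviest edges.

    Returns a list of (node_in_tree, new_node, weight) tuples.
    """
    phi = np.asarray(phi, dtype=float)
    n = phi.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True                     # start from an arbitrary node
    best_weight = phi[0].copy()           # heaviest known edge from the tree to each node
    best_parent = np.zeros(n, dtype=int)  # tree endpoint of that edge
    edges = []

    for _ in range(n - 1):
        # Pick the node outside the tree reachable through the heaviest edge.
        candidates = np.where(in_tree, -np.inf, best_weight)
        j = int(np.argmax(candidates))
        edges.append((int(best_parent[j]), j, float(best_weight[j])))
        in_tree[j] = True

        # The new node may offer heavier edges to the nodes still outside.
        improved = ~in_tree & (phi[j] > best_weight)
        best_weight[improved] = phi[j][improved]
        best_parent[improved] = j

    return edges

# Toy symmetric matrix for four products (made-up values).
weights = np.array([[0.0, 0.9, 0.2, 0.1],
                    [0.9, 0.0, 0.6, 0.3],
                    [0.2, 0.6, 0.0, 0.8],
                    [0.1, 0.3, 0.8, 0.0]])
tree_edges = maximum_spanning_tree(weights)
print(tree_edges)
```

For a real product matrix with hundreds of nodes, a graph library would be a reasonable alternative; the loop above is only meant to mirror the steps in the text.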

The previous tree is the most important object of the work described. It has a very interesting economic meaning. From its definition, we know that products that are close in the tree are more likely to be developed together, that is, in the same country.