I have been trying to find a way of visualising the strength of the connections between clusters in the news network, something that isn’t particularly intuitive when looking at the full network map. Sure, it’s divided by colours, and if you followed all the lines you’d probably work out the intensity of connections between the individual clusters, but I was looking for another way of describing the relationship.

I came across a type of diagram called a chord diagram, which is designed specifically for plotting relationships between elements. I’ve seen them quite a lot in various data journalism projects: they look striking, but I think they also convey the strength of connections in a really intuitive way.

The first problem was quantifying the relationships in a meaningful way. My initial attempt was to plot all the connections for the individual cities, but with over 1,300 links it made a meaningless, overly complex visualisation. My analysis has identified seven distinct clusters within the network, so I decided to plot the summed weight of the connections within each cluster and between each pair of clusters instead.

Each connection has a weight, which is just the number of times those two places are connected. I created a table with each city labelled by its cluster, then summed the weights of all the connections running from one cluster to another, for every pair of clusters.

Ugh, that sounds confusing. Let’s say the table contained the following links:

Venice (cluster 1) -> Constantinople (cluster 1) – weight 3

Venice (cluster 1) -> Augspurg (cluster 2) – weight 2

Cologne (cluster 2) -> Venice (cluster 1) – weight 2

Leiden (cluster 2) -> Constantinople (cluster 1) – weight 2

Paris (cluster 3) -> Rome (cluster 1) – weight 4

That gives a matrix that looks like the following:

Cluster 1 -> Cluster 1 – weight 3

Cluster 1 -> Cluster 2 – weight 2

Cluster 2 -> Cluster 1 – weight 4

Cluster 3 -> Cluster 1 – weight 4

And so on and so forth, with all 700 or so unique connections, for all 7 clusters. That makes 21 possible pairings of different clusters, or 49 cells in the full matrix once direction and each cluster’s connections to itself are counted. Then I used the R package ‘circlize’ to make the chord diagram (and ‘Cairo’ to produce an anti-aliased version).
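The aggregation step described above can be sketched in a few lines. This is only an illustration in Python, not the R code used to make the diagrams: the toy links and cluster numbers are the ones from the example, and the names `links`, `cluster_of` and `aggregate_by_cluster` are made up for this sketch.

```python
from collections import defaultdict

# Example links from the post: (source city, target city, weight).
links = [
    ("Venice", "Constantinople", 3),
    ("Venice", "Augspurg", 2),
    ("Cologne", "Venice", 2),
    ("Leiden", "Constantinople", 2),
    ("Paris", "Rome", 4),
]

# Cluster membership for each city, as given in the example.
cluster_of = {
    "Venice": 1, "Constantinople": 1, "Rome": 1,
    "Augspurg": 2, "Cologne": 2, "Leiden": 2,
    "Paris": 3,
}


def aggregate_by_cluster(links, cluster_of):
    """Sum link weights for each (source cluster, target cluster) pair."""
    totals = defaultdict(int)
    for source, target, weight in links:
        totals[(cluster_of[source], cluster_of[target])] += weight
    return dict(totals)


cluster_weights = aggregate_by_cluster(links, cluster_of)
for (src, dst), weight in sorted(cluster_weights.items()):
    print(f"Cluster {src} -> Cluster {dst}: weight {weight}")
```

Keeping direction in the key, so that (1, 2) and (2, 1) are separate entries, matches the example, where Cluster 1 -> Cluster 2 and Cluster 2 -> Cluster 1 carry different weights.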

The chord diagram plots the cluster-to-cluster matrix described above. This first diagram is a plot of all possible combinations, including the clusters’ connections to themselves (that is, cities sending information to other cities within their own cluster). Not surprisingly, this is overwhelmingly the most common type of connection.

It is interesting in its own right: it shows us the extent of the clustering in the network and points to the idea of ‘the strength of weak ties’, a theory which states that the most important bits of the network are the connecting nodes: the thin lines stretching out here to other parts of the world are the reason you can hop from one city to another with relative ease. Interconnectivity relies on a small number of tenuous connections; sever these and the whole thing falls apart.

I also plotted another version, with the clusters’ connections to themselves taken out:

This tells another story – that of the interconnectedness of the individual clusters. We can see that Central Germany is particularly well connected outside its own cluster, and Venice, a hugely dominant point on a network map, shrinks right back. It’s weirdly insular – although it connects to a huge number of places in the East, it doesn’t have the same kind of variety some other parts of the network have. It shows you can be a hub but also kind of insignificant in the wider picture at the same time.
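In matrix terms, taking out the clusters’ connections to themselves just means dropping the diagonal. A rough Python sketch of that step, again using the toy weights from the earlier example rather than the real network, with illustrative function names that are not part of any package:

```python
def without_self_links(cluster_weights):
    """Return a copy keeping only links between different clusters."""
    return {
        (src, dst): weight
        for (src, dst), weight in cluster_weights.items()
        if src != dst
    }


def external_totals(cluster_weights):
    """Total weight each cluster shares with other clusters, in either direction."""
    totals = {}
    for (src, dst), weight in without_self_links(cluster_weights).items():
        totals[src] = totals.get(src, 0) + weight
        totals[dst] = totals.get(dst, 0) + weight
    return totals


# Toy cluster-to-cluster weights from the worked example (not real data).
example = {(1, 1): 3, (1, 2): 2, (2, 1): 4, (3, 1): 4}

between = without_self_links(example)
print(between)
print(external_totals(example))
```

Comparing a cluster’s external total with its self-connection weight gives a rough measure of how insular it is.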