In this post, I’m going to introduce you to a cool-looking graph, tell you what it means, and give the technical details of its generation—all because I think America might care. Here we go.

Introduction

Politics in America are hopelessly partisan, and all of the bickering serves only to cripple our nation at a moment of crisis when decisive action is called for. You know it. I know it. Barack Obama and John Boehner know it. Your grandma knows it.

Or do we know it? The belief that American politics has become more polarized in recent decades is widespread. But is there any evidence for it? While I make no attempt to provide a complete explanation for this disturbing trend in our nation’s governance, in this post I present some work that I believe provides an answer—a resounding confirmation that, according to at least one view of the situation, the politics of the United States are now more deeply divided than ever.

Though this work was done in collaboration with Michael Dimond as part of an advanced data mining course (CS676) at BYU, I believe I am the sole author of the portions of our report excerpted below.

The Cool-Looking Graph

Here’s the pretty picture:

Bask in its glory—and be grateful, because that thing took a lot of work! Make sure to click on the image to see the full-sized version. (It will open in a new window/tab.)

What It Means

The above graph is a visual representation of the United States Senate across 222 years of legislative history. It is, in essence, a social network of senators across time—who voted like whom, what cliques and factions formed, etc. In other words, retroactive Facebook for America’s past politicians? No, that’s going too far….

Anyway, here’s how to interpret the graph. Each node (circle) represents a senator. An arc is drawn between two nodes if the two senators at the endpoints voted on the same bill at least once and voted the same way on bills more than 75% of the time. Size and color of nodes indicate their centrality (a measure of importance) in the network. Scanning from left (1789) to right (2011), a few trends emerge:

The height of the graph increases. Much of this can be attributed to the increase in the number of states, from 13 to 50, meaning the number of senators serving simultaneously increased by 74 (from 26 to 100).

The graph alternates between unity and polarization. Visually, unity looks like a single “stream” of nodes, whereas polarization is the graph splitting into two components that move in slightly different directions.

In recent decades, the height of the graph has continued to increase in spite of the number of senators being fixed at 100 since 1959. I assert that this corresponds to the phenomenon of increased polarization between the two parties.

I am interested in whether the flow of the graph can be correlated with developments in the American two-party system. Feel free to let me know your thoughts on that. For those wishing to play with the graph data, it’s available here.

Technical Details

This stuff gets pretty computer sciencey, so only read on if you really want to nerd out.

Data

The graph is generated from an aggregated and sanitized version of the THOMAS congressional data available from govtrack.us, which comes to 2.1 GiB of primarily XML-encoded data covering the 1st through the 112th Congress. The data includes a record of every legislator's vote on every roll call since the 1st Congress, as well as party affiliation.
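To give a feel for what working with this kind of data involves, here is a minimal Python sketch that reads one roll call from XML into a dictionary of votes. The element and attribute names (`roll`, `voter`, `id`, `vote`, `session`) and the sample record are assumptions made for illustration; they are not a verified copy of the govtrack.us schema, and this is not the code used for the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record. The tag and attribute names are assumed
# for illustration and may not match the real govtrack.us files.
SAMPLE_ROLL = """
<roll session="112" roll="1">
  <voter id="300001" vote="+"/>
  <voter id="300002" vote="-"/>
  <voter id="300003" vote="0"/>
</roll>
"""


def parse_roll(xml_text):
    """Return (session, roll_id, {legislator_id: vote}) for one roll call."""
    root = ET.fromstring(xml_text)
    session = int(root.get("session"))
    roll_id = root.get("roll")
    # One <voter> element per legislator who has a recorded vote.
    votes = {voter.get("id"): voter.get("vote") for voter in root.iter("voter")}
    return session, roll_id, votes


session, roll_id, votes = parse_roll(SAMPLE_ROLL)
print(session, roll_id, votes)
```

Each parsed roll call becomes a plain mapping from legislator to vote, which is a convenient shape for the pairwise comparison described in the next section.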

Social Graph Inference

Let $L$ be the set of all legislators and $S$ be the set of all sessions of congress. We define a legislator-to-legislator similarity function $\mathrm{sim}(a, b)$ that returns a similarity score for all pairs of legislators $a, b \in L$ that ever voted on the same roll call:

$$\mathrm{sim}(a, b) = \frac{\sum_{s \in S} \sum_{r \in R(s)} I\big(\mathrm{served}(a, s) \wedge \mathrm{served}(b, s)\big)\, I\big(v(a, r) = v(b, r)\big)}{\sum_{s \in S} \sum_{r \in R(s)} I\big(\mathrm{served}(a, s) \wedge \mathrm{served}(b, s)\big)}$$

where

$R(s)$ returns the set of all roll calls (votes) occurring in session $s$;

$I(x)$ is an indicator function returning 1 when $x$ is true, 0 otherwise;

$v(l, r)$ returns the vote cast by legislator $l$ on roll $r$; and

$\mathrm{served}(l, s)$ is true iff legislator $l$ served in congressional session $s$.
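The similarity score described above can be sketched in a few lines of Python. This is a simplified illustration, not the project's code: it assumes each roll call is already a dictionary mapping a legislator to a vote, and it counts only roll calls on which both legislators have a recorded vote (rather than every roll call in a session both served in). The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def similarity_scores(roll_calls):
    """Return {(a, b): fraction of shared roll calls with identical votes}.

    roll_calls is an iterable of dicts mapping legislator id -> vote.
    Pairs are keyed with a < b, and only pairs that appear together in
    at least one roll call receive a score.
    """
    agree = defaultdict(int)   # roll calls where a and b voted the same way
    shared = defaultdict(int)  # roll calls where both a and b have a vote
    for votes in roll_calls:
        voters = sorted(votes)
        for i, a in enumerate(voters):
            for b in voters[i + 1:]:
                shared[(a, b)] += 1
                if votes[a] == votes[b]:
                    agree[(a, b)] += 1
    return {pair: agree[pair] / n for pair, n in shared.items()}


# Made-up votes for three hypothetical senators.
rolls = [
    {"A": "+", "B": "+", "C": "-"},
    {"A": "+", "B": "-"},
    {"A": "-", "B": "-", "C": "-"},
    {"A": "+", "B": "+"},
]
scores = similarity_scores(rolls)
print(scores)
```

The double loop over voters is quadratic in the number of legislators per roll call, which is fine for a chamber of at most 100 senators.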

We use this similarity measure to construct a legislator affinity graph as follows:

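Let $G = (V, E)$ be an undirected graph with a set of vertices $V$ and a set of weighted edges $E$, such that

$$V = \{\, \mathrm{vertex}(l) : l \in L \,\}$$

and

$$E = \{\, \mathrm{edge}(a, b, \mathrm{sim}(a, b)) : a, b \in L,\ a \neq b,\ \mathrm{sim}(a, b) > \theta \,\}$$

where

$\mathrm{vertex}(l)$ yields the vertex associated with a given legislator $l$;

$\mathrm{edge}(a, b, w)$ yields an undirected edge with weight $w$ and endpoints $\mathrm{vertex}(a)$ and $\mathrm{vertex}(b)$, and

$\theta$ is a minimum similarity threshold.

Here $L$ is the set of all legislators and $\mathrm{sim}$ is the similarity function defined above.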
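As a rough illustration of the construction above, the following Python sketch filters a table of pairwise similarity scores by a threshold and writes the surviving edges as a CSV edge list. The function names and the sample scores are hypothetical. The `Source`, `Target`, `Weight` header follows the column names that Gephi's spreadsheet importer recognizes; whether the original graph was loaded this way is an assumption.

```python
import csv
import io


def build_affinity_graph(scores, threshold=0.75):
    """Return (vertices, edges) for pairs scoring strictly above threshold."""
    # Every legislator that appears in any scored pair becomes a vertex.
    vertices = sorted({leg for pair in scores for leg in pair})
    edges = [(a, b, w) for (a, b), w in sorted(scores.items()) if w > threshold]
    return vertices, edges


def write_edge_csv(edges, handle):
    """Write edges as a Source/Target/Weight table."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)


# Made-up similarity scores for three hypothetical senators.
scores = {("A", "B"): 0.9, ("A", "C"): 0.75, ("B", "C"): 0.2}
vertices, edges = build_affinity_graph(scores)

buffer = io.StringIO()
write_edge_csv(edges, buffer)
print(buffer.getvalue())
```

Note that the pair scoring exactly 0.75 is dropped, because the rule is "more than" the threshold, and that a legislator with no surviving edges still appears as an isolated vertex.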

Rendering

In practice, the minimum similarity threshold above must be set high (I used 0.75) to keep the number of edges from becoming excessively large. Once the graph was constructed, I loaded it into Gephi, a graph visualization tool. I computed betweenness centralities, sized and colored the nodes by centrality, and applied a force-directed layout algorithm. I then manually rotated the graph so that earlier senators are located on the left and more recent senators on the right, to give the effect of a rough historical timeline. I exported the result as an SVG file and loaded it in the Inkscape vector graphics program. With the benefit of 16 GB of RAM, I coaxed Inkscape into rendering a 20,000-pixel-wide PNG image of the graph. Finally, I used GIMP to scale the image down to 10,000 pixels wide for web distribution.
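For readers curious what a betweenness centrality computation looks like, here is a small Python sketch of Brandes' algorithm for an unweighted, undirected graph. It is an illustration only: it is not Gephi's code, it ignores edge weights, and the adjacency-dictionary input format is an assumption of this example.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adjacency maps each node to an iterable of its neighbours, and every
    edge must be listed in both directions.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from source.
        order = []                                  # nodes by distance
        predecessors = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}           # shortest-path counts
        dist = {v: -1 for v in adjacency}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:                     # first time w is seen
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:          # v lies on a shortest path
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve.
    return {v: c / 2 for v, c in centrality.items()}


# A four-node path A - B - C - D: only the middle nodes carry traffic.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
result = betweenness_centrality(path)
print(result)
```

In the senate graph, a node with high betweenness sits on many shortest paths between other senators, which is why such nodes tend to appear where otherwise separate voting blocs connect.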

Acknowledgements

Thanks to Christophe Giraud-Carrier for teaching the class for which this graph was generated, and to Michael Dimond who, though not directly working on this portion of our project, was nevertheless an excellent collaborator. And to my friend who convinced me to finally finish this post.