2013-08-13 Graphs: a Balancing Act

A while ago, some friends and I wrote a C++ tool to generate and visualise graphs, and I was surprised at how easy it is to “balance” graph vertices so that they are laid out in a nice way. This tutorial reproduces a version of the algorithm in Haskell, using the gloss library to get the graph on the screen. Apart from gloss, nothing outside the Haskell Platform is needed.

This tutorial is aimed at beginners, and only a basic knowledge of Haskell is required—we disregard performance in favour of simple code. Here is a preview of the result:

Preliminaries

We import the libraries we need, qualifying Map and Set to avoid clashes with the Prelude.

The idea

First, let’s frame the problem we want to solve. We have an undirected graph, and we want to position its vertices on a surface so that the result is pleasant to look at. “Pleasant to look at” is still a very vague requirement, depending on fuzzy things like human taste, and in fact there are many ways to approach this problem.

We will gain inspiration from physics, and take vertices to be like charged particles repelling each other, and edges to be like elastic bands pulling the vertices together. We will calculate the forces and update the positions in rounds, and hopefully after some time our graph will stabilise. With the right numbers, this gives surprisingly good results: clusters of vertices are held together by the numerous edges between them, while sparsely connected vertices remain distant, reducing clutter.

The Graph

We need some kind of identifier for our vertices; we will simply go for Int. An Edge is a pair of Vertex values.

We want to store our graph so that the operations we need to execute are as natural as possible. Given the algorithm outlined above, we need to do two things well: iterating through all the vertices, and iterating through the neighbours of a given vertex. With that in mind, the simplest thing to do is to store the graph as the set of neighbouring nodes for each Vertex.

When we add a vertex, we make sure that a set of neighbours exists for that vertex. In this way adding existing vertices will not modify the graph.

When we add an Edge, we first make sure that the vertices provided are present in the graph by adding them, and then add each vertex to the other vertex’s neighbours.

vertexNeighs unsafely gets the neighbours of a given Vertex: the precondition is that the Vertex provided is in the graph.

This is all we need to implement the algorithm. It is also useful to have a function returning all the edges in the Graph so that we can draw them. Set.foldr and Map.foldrWithKey are equivalent to the usual foldr for lists, with the twist that with a Map we fold over the key and value at the same time. Since the graph is undirected, we “order” each edge so that the vertex with the lower id appears first: in this way we will avoid duplicates like (1, 2) and (2, 1).
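The graph operations described in this section can be sketched as follows. This is a minimal reconstruction from the prose, not necessarily the original code: the names emptyGraph and graphEdges, and the helpers link, collect and order, are assumptions made for illustration.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

type Vertex = Int
type Edge   = (Vertex, Vertex)

-- Each vertex is mapped to the set of its neighbours.
type Graph = Map Vertex (Set Vertex)

-- Assumed name: a graph with no vertices.
emptyGraph :: Graph
emptyGraph = Map.empty

-- Make sure a neighbour set exists for the vertex. If the vertex is
-- already present, its neighbours are left untouched.
addVertex :: Vertex -> Graph -> Graph
addVertex v = Map.insertWith Set.union v Set.empty

-- Add both endpoints first, then record each one as a neighbour of
-- the other.
addEdge :: Edge -> Graph -> Graph
addEdge (v1, v2) = link v1 v2 . link v2 v1 . addVertex v1 . addVertex v2
  where
    link from to = Map.adjust (Set.insert to) from

-- Unsafe: the vertex must be in the graph, otherwise Map.! fails.
vertexNeighs :: Vertex -> Graph -> Set Vertex
vertexNeighs v gr = gr Map.! v

-- All edges, with the lower vertex id first so that (1, 2) and
-- (2, 1) collapse into a single entry.
graphEdges :: Graph -> Set Edge
graphEdges = Map.foldrWithKey collect Set.empty
  where
    collect v neighs acc = Set.foldr (\n -> Set.insert (order (v, n))) acc neighs
    order (a, b) = if a > b then (b, a) else (a, b)
```

Adding the edge (1, 2) and then (2, 1) to an empty graph should leave a single edge (1, 2) in graphEdges.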

The Scene

Now that we have our graph, we need a data structure recording the position of each point. We also want to be able to “grab” points to move them around, so we add a field recording whether we have a Vertex grabbed or not. We also make use of the gloss ViewState, which will let us implement panning, rotating, and zooming in an easy way.

Then two predictable operations: one that adds a Vertex, with its initial position on the scene, and one that adds an Edge. When adding the Edge, we need both points to be already present (see the invariant for Scene). We cannot simply add the vertices like we do in addEdge because we need their positions.

It is also useful to have a helper to get the position of a Vertex.
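A possible shape for the scene, as a sketch only. Scene, scPoints and vertexPos appear in the text; the field names scGraph, scSelected and scViewState, and the functions emptyScene, scAddVertex and scAddEdge, are assumptions. The invariant is taken to be that every vertex in the graph has a position. The gloss imports assume a release that provides Graphics.Gloss.Data.ViewState.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set
import           Graphics.Gloss.Data.Picture (Point)
import           Graphics.Gloss.Data.ViewState (ViewState, viewStateInit)

type Vertex = Int
type Edge   = (Vertex, Vertex)
type Graph  = Map Vertex (Set Vertex)

-- Assumed invariant: every vertex in scGraph has a position in scPoints.
data Scene = Scene
  { scGraph     :: Graph
  , scPoints    :: Map Vertex Point
  , scSelected  :: Maybe Vertex   -- the vertex currently grabbed, if any
  , scViewState :: ViewState      -- panning, rotation and zoom
  }

emptyScene :: Scene
emptyScene = Scene Map.empty Map.empty Nothing viewStateInit

-- Add a vertex together with its initial position.
scAddVertex :: Vertex -> Point -> Scene -> Scene
scAddVertex v pt sc = sc
  { scGraph  = Map.insertWith Set.union v Set.empty (scGraph sc)
  , scPoints = Map.insert v pt (scPoints sc)
  }

-- Add an edge between two vertices that are already in the scene.
scAddEdge :: Edge -> Scene -> Scene
scAddEdge (v1, v2) sc
  | Map.member v1 pts && Map.member v2 pts =
      sc { scGraph = link v1 v2 (link v2 v1 (scGraph sc)) }
  | otherwise = error "scAddEdge: both vertices must already be in the scene"
  where
    pts = scPoints sc
    link from to = Map.adjust (Set.insert to) from

-- Unsafe: the vertex must be in the scene.
vertexPos :: Vertex -> Scene -> Point
vertexPos v sc = scPoints sc Map.! v
```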

Drawing

Now we can write the functions to convert the Scene to a Picture. Thanks to gloss, this is extremely easy: we are offered a simple data type that gloss will use to get things on the screen.

Some constants:

Drawing a Vertex is simply drawing a circle. We use ThickCircle to get the circle to be filled instead of just an outline.

Drawing an Edge is drawing a Line.

Bringing everything together, we generate Pictures for all the vertices and all the edges, and then combine those with the appropriate colours. Moreover, we get the ViewPort in the ViewState, which stores the current translation, rotation, and scaling, and apply it to the picture.
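The drawing functions can be sketched as below. To keep the example self-contained it works on plain lists of positions rather than on the full Scene, and the constant values, the colours, and the names drawGraph and preview are assumptions. The filled circle relies on ThickCircle taking a radius and a thickness: a ring of radius r/2 and thickness r covers the whole disc of radius r.

```haskell
import Graphics.Gloss
  (Color, Picture (..), Point, color, makeColor, pictures)
import Graphics.Gloss.Data.ViewPort
  (ViewPort, applyViewPortToPicture, viewPortInit)

-- Assumed values; tune to taste.
vertexRadius :: Float
vertexRadius = 6

vertexColor, edgeColor :: Color
vertexColor = makeColor 1 0 0 1
edgeColor   = makeColor 1 1 1 0.8

-- A filled disc of radius vertexRadius centred on the vertex.
drawVertex :: Point -> Picture
drawVertex (x, y) =
  Translate x y (ThickCircle (vertexRadius / 2) vertexRadius)

-- A straight line between the two endpoints.
drawEdge :: Point -> Point -> Picture
drawEdge p1 p2 = Line [p1, p2]

-- Edges first, vertices on top, everything transformed by the view port.
drawGraph :: ViewPort -> [Point] -> [(Point, Point)] -> Picture
drawGraph port vertices edges =
  applyViewPortToPicture port $
    pictures
      [ color edgeColor   (pictures (map (uncurry drawEdge) edges))
      , color vertexColor (pictures (map drawVertex vertices))
      ]

-- Example: two connected vertices under the initial (identity) view port.
preview :: Picture
preview = drawGraph viewPortInit [(0, 0), (100, 50)] [((0, 0), (100, 50))]
```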

Balancing

Now to the interesting part, the code necessary to balance the graph. As mentioned, we have two contrasting forces. Each vertex “pushes” all the others away, and each edge “pulls” together the connected vertices.

First we define a function for the “pushing” force, resulting from the charge of the vertices. Predictably, the force will be inversely proportional to the square of the distance between the two vertices. Graphics.Gloss.Data.Vector defines

type Vector = (Float, Float)

and also a Num instance for Vector, which means that we can take advantage of vector subtraction to easily get the distance and the direction of the force.

The charge of each particle has been determined empirically to give good results: increasing it will lead to a more “spaced out” graph, decreasing it to a more crowded one. mulSV lets us multiply Vectors by scalars, and magV lets us get the magnitude of a vector (in this case the distance).

As for the force that pulls connected vertices together, it will be proportional to the distance between the two vertices, so we can take the distance vector directly and multiply it by the stiffness, although this time we have the vector point in the other direction, since this force brings the vertices together.
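Written out, the two forces acting on a vertex at position p_1 because of a vertex at p_2 look as follows. The symbols are introduced here for illustration: q stands for the charge constant and k for the stiffness, and the push is only defined when the two positions differ.

```latex
\vec{F}_{\mathrm{push}}(p_1, p_2)
  = \frac{q}{\lVert p_1 - p_2 \rVert^{2}} \cdot \frac{p_1 - p_2}{\lVert p_1 - p_2 \rVert},
\qquad
\vec{F}_{\mathrm{pull}}(p_1, p_2) = k \, (p_2 - p_1)
```

The push points away from the other vertex and fades with the square of the distance, while the pull points towards it and grows linearly with the distance, so for connected vertices there is a distance at which the two cancel out.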

We can then write a function to get the velocity of a Vertex in each round:

We bring everything together by calculating the new position for each vertex. We do not move the vertex that is currently selected by the user, if there is one.
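One round of balancing can be sketched as follows. This is an illustration under several assumptions rather than the original code: it works on a plain map of positions instead of the full Scene, the values of charge and stiffness are guesses, and the vector helpers vadd, vsub, scale and norm are hypothetical stand-ins for the gloss functions mentioned above (mulSV, magV and the Num instance, which as far as I know is no longer shipped by recent gloss releases).

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set

type Vertex = Int
type Graph  = Map Vertex (Set Vertex)
type Vec    = (Float, Float)

-- Hypothetical helpers standing in for the gloss vector functions.
vadd, vsub :: Vec -> Vec -> Vec
vadd (a, b) (c, d) = (a + c, b + d)
vsub (a, b) (c, d) = (a - c, b - d)

scale :: Float -> Vec -> Vec
scale k (x, y) = (k * x, k * y)

norm :: Vec -> Float
norm (x, y) = sqrt (x * x + y * y)

-- Assumed constants.
charge, stiffness :: Float
charge    = 100000
stiffness = 0.5

-- Repulsion on p1 caused by p2: magnitude charge / d^2, pointing away
-- from p2. Coincident points exert no force on each other.
pushForce :: Vec -> Vec -> Vec
pushForce p1 p2
  | d > 0     = scale (charge / d ^ (3 :: Int)) diff
  | otherwise = (0, 0)
  where
    diff = p1 `vsub` p2
    d    = norm diff

-- Attraction on p1 caused by its neighbour p2: proportional to the
-- distance, pointing towards p2.
pullForce :: Vec -> Vec -> Vec
pullForce p1 p2 = scale stiffness (p2 `vsub` p1)

-- Sum of the pushes from every other vertex and the pulls from the
-- neighbours.
velocity :: Graph -> Map Vertex Vec -> Vertex -> Vec
velocity gr pts v = foldr vadd (0, 0) (pushes ++ pulls)
  where
    here   = pts Map.! v
    pushes = [ pushForce here p | (u, p) <- Map.toList pts, u /= v ]
    pulls  = [ pullForce here (pts Map.! u)
             | u <- Set.toList (Map.findWithDefault Set.empty v gr) ]

-- Move every vertex except the selected one by its velocity times the
-- elapsed time dt.
updatePositions :: Float -> Maybe Vertex -> Graph -> Map Vertex Vec -> Map Vertex Vec
updatePositions dt selected gr pts = Map.mapWithKey move pts
  where
    move v p
      | Just v == selected = p
      | otherwise          = p `vadd` scale dt (velocity gr pts v)
```

All velocities are computed from the positions at the start of the round, so the order in which vertices are visited does not affect the result.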

User interaction

When a user clicks to grab a point, we need to check if she has caught something. Thus we define inCircle to check if a point is inside the drawn version of a vertex.

findVertex iterates through all the vertices and returns one if the position where the user has clicked is in it.
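A minimal sketch of the two functions, assuming that both the click and the vertex positions are expressed in the same (scene) coordinates and that vertices are drawn with a radius called vertexRadius, whose value here is a guess. The signatures are assumptions as well; here findVertex takes the map of positions directly.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type Vertex = Int
type Point  = (Float, Float)

-- Assumed drawing radius of a vertex.
vertexRadius :: Float
vertexRadius = 6

-- Is the first point inside the circle drawn around the second one?
-- Comparing squared distances avoids a square root.
inCircle :: Point -> Point -> Bool
inCircle (px, py) (cx, cy) = dx * dx + dy * dy <= vertexRadius * vertexRadius
  where
    dx = px - cx
    dy = py - cy

-- The first vertex (lowest id) whose circle contains the click, if any.
findVertex :: Point -> Map Vertex Point -> Maybe Vertex
findVertex click pts =
  case Map.keys (Map.filter (inCircle click) pts) of
    v : _ -> Just v
    []    -> Nothing
```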

User input will come in the form of Events, a gloss data type that represents key or mouse button presses and mouse motion. Thus we define handleEvent to process an Event and a Scene, producing a new Scene.

We want the user to be able to grab vertices. Since the default configuration for the ViewState, which we are using, already uses the left and right mouse buttons for its actions, we require the user to press Ctrl and click.

invertViewPort “undoes” the rotation, translation and scaling applied by the ViewPort to the picture, so that we can map user input to the coordinates that scPoints refers to.

When the user releases the left mouse button and a vertex is selected, we deselect it.

When the user moves the mouse and a vertex is selected, we move the vertex to where the cursor is.

When none of the above apply, we pass the event to the ViewState, which will handle the panning, rotating, and zooming.
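The four cases above can be sketched as one function with four clauses. To stay self-contained the Scene here is a cut-down version without the graph, and the names scSelected, scViewState, sceneAt, toScene and hit, as well as the radius value, are assumptions. What happens when a Ctrl-click misses every vertex is not specified in the text; this sketch simply ignores it. The imports assume a gloss release that provides Graphics.Gloss.Data.ViewState.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Graphics.Gloss.Data.Picture (Point)
import           Graphics.Gloss.Data.ViewPort (invertViewPort)
import           Graphics.Gloss.Data.ViewState
                   (ViewState (..), updateViewStateWithEvent, viewStateInit)
import           Graphics.Gloss.Interface.Pure.Game
                   (Event (..), Key (..), KeyState (..), Modifiers (..), MouseButton (..))

type Vertex = Int

-- Cut-down scene: positions, grabbed vertex, view state.
data Scene = Scene
  { scPoints    :: Map Vertex Point
  , scSelected  :: Maybe Vertex
  , scViewState :: ViewState
  }

-- A scene with the given positions, nothing selected, default view.
sceneAt :: Map Vertex Point -> Scene
sceneAt pts = Scene pts Nothing viewStateInit

vertexRadius :: Float
vertexRadius = 6

-- Is the click within vertexRadius of the vertex position?
hit :: Point -> Point -> Bool
hit (px, py) (cx, cy) =
  (px - cx) * (px - cx) + (py - cy) * (py - cy) <= vertexRadius * vertexRadius

-- Map window coordinates back to the coordinates used by scPoints.
toScene :: Scene -> Point -> Point
toScene sc = invertViewPort (viewStateViewPort (scViewState sc))

handleEvent :: Event -> Scene -> Scene
-- Ctrl + left click: grab the vertex under the cursor, if there is one.
handleEvent (EventKey (MouseButton LeftButton) Down (Modifiers {ctrl = Down}) pos) sc =
  case [ v | (v, p) <- Map.toList (scPoints sc), hit (toScene sc pos) p ] of
    v : _ -> sc { scSelected = Just v }
    []    -> sc
-- Left button released while a vertex is grabbed: let go of it.
handleEvent (EventKey (MouseButton LeftButton) Up _ _) sc@Scene {scSelected = Just _} =
  sc { scSelected = Nothing }
-- Mouse moved while a vertex is grabbed: drag the vertex along.
handleEvent (EventMotion pos) sc@Scene {scSelected = Just v} =
  sc { scPoints = Map.insert v (toScene sc pos) (scPoints sc) }
-- Anything else goes to the view state (panning, rotating, zooming).
handleEvent ev sc =
  sc { scViewState = updateViewStateWithEvent ev (scViewState sc) }
```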

Running

Finally, we put the code above to good use. We will use a sample graph to draw:

Then a utility function fromEdges initialises a scene from a list of edges, randomising the positions of the vertices within the initial window size:
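A possible fromEdges, sketched under a few assumptions: it returns the graph and the positions as a pair instead of a full Scene, the window size and the names windowSize and randomPoint are made up, and randomness comes from randomRIO in the random package. Since gloss places the origin at the centre of the window, positions are drawn from half the window size on either side of zero.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set (Set)
import qualified Data.Set as Set
import           System.Random (randomRIO)

type Vertex = Int
type Edge   = (Vertex, Vertex)
type Point  = (Float, Float)
type Graph  = Map Vertex (Set Vertex)

-- Assumed initial window size (width, height).
windowSize :: (Float, Float)
windowSize = (640, 480)

-- A random position inside the window, with the origin at its centre.
randomPoint :: IO Point
randomPoint = do
  x <- randomRIO (-w / 2, w / 2)
  y <- randomRIO (-h / 2, h / 2)
  return (x, y)
  where
    (w, h) = windowSize

-- Build the graph from the edge list and give every vertex that
-- appears in it a random position.
fromEdges :: [Edge] -> IO (Graph, Map Vertex Point)
fromEdges es = do
  let vs = Set.toList (Set.fromList (concat [ [a, b] | (a, b) <- es ]))
  pts <- mapM (\v -> (,) v <$> randomPoint) vs
  let link (a, b) = Map.adjust (Set.insert b) a . Map.adjust (Set.insert a) b
      gr = foldr link (Map.fromList [ (v, Set.empty) | v <- vs ]) es
  return (gr, Map.fromList pts)
```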

Finally, we use the play function provided by gloss to make everything work. The important arguments in play are the last two functions, which update the state of the world after a user event and after a time step, respectively. In our case handleEvent and updatePositions will do the job, with our world being a Scene.
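To show how the arguments of play fit together, here is a sketch with a deliberately stripped-down world: a single point instead of the Scene used in this tutorial. Everything in it (World, drawWorld, handleWorld, stepWorld, runWorld, the window title and size, the frame rate) is made up for illustration; only the order of the arguments to play is the point.

```haskell
import Graphics.Gloss
  (Display (InWindow), Picture, black, circleSolid, color, translate, white)
import Graphics.Gloss.Interface.Pure.Game (Event (EventMotion), play)

-- The world is just the position of one dot.
type World = (Float, Float)

-- How to turn the world into a picture.
drawWorld :: World -> Picture
drawWorld (x, y) = translate x y (color white (circleSolid 10))

-- How a user event changes the world: the dot jumps to the cursor.
handleWorld :: Event -> World -> World
handleWorld (EventMotion pos) _ = pos
handleWorld _ w                 = w

-- How the world changes after dt seconds: the dot drifts to the origin.
stepWorld :: Float -> World -> World
stepWorld dt (x, y) = (x - x * dt, y - y * dt)

-- Window, background colour, steps per second, initial world, then the
-- drawing function and the two update functions.
runWorld :: IO ()
runWorld =
  play (InWindow "play sketch" (640, 480) (10, 10)) black 30 (0, 0)
       drawWorld handleWorld stepWorld
```

In the tutorial's setting the world type is Scene, and handleEvent and updatePositions take the places of handleWorld and stepWorld.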

Then all that is left to do is to initialise the Scene and run sceneWindow.

Improvements

The code provided is a good starting point for many improvements; here we give some suggestions.

Performance

The code does not scale well for big graphs, for a number of reasons.

QuadTree/Voronoi diagram: Currently our algorithm is quadratic in the number of vertices: for each vertex we go over all the other vertices for the push forces, and over its neighbours for the pull forces. It can be made much faster by approximating distant clusters of vertices as a single particle with a higher charge. An easy way is to recursively subdivide the space into squares, a goal achievable by storing the graph in a QuadTree. Squares that are far enough away are then treated as a single entity. A more precise but also more expensive way is to subdivide the space in a more irregular way depending on the disposition of the vertices, for example in what is called a Voronoi diagram.

Arrays: Currently, once a graph is loaded, it stays the same forever. Given this, using Map is quite a waste: we can utilise a structure with much better performance to store the graph, such as an Array or a Vector. The best option would probably be an unboxed Vector.

Functionality

Weighted edges: We can easily adapt the algorithm to work with weighted edges by adjusting the stiffness of each edge depending on the weight. For example, if the weight represents the distance between two connected nodes, the stiffness will be inversely proportional to the weight, so that closer vertices will indeed end up being closer.

Generating graphs: Generating realistic graphs is an interesting and useful challenge. It turns out that many real networks, such as friendships and the web, share certain characteristics. Such networks are known as small-world networks, and various algorithms to generate them are available.

3D: The algorithm can be trivially extended to the third dimension. In fact, given the right Num instances it will work automatically in 3D, and with some type class trickery in any dimension. The hard part would be drawing the graph, since gloss does not go beyond 2 dimensions, and raw OpenGL is so much uglier.

dot files: The program could be enhanced with a parser for dot or a similar format, so that experiments could be run on existing graphs.



If you implement any of the above in a nice way, let me know!

As usual, comments on Reddit.