As in the previous posts, I’ll leave the rigorous maths for the companion notebook and stick to explaining the general idea. Like Self-Organising Feature Maps, GNGs are iterative algorithms. Unlike SOFMs, however, they do not require the number of neurons to be specified up front: as the name suggests, GNGs are growing, and new neurons keep getting added for as long as the algorithm is running.

Each iteration begins by picking a data point from the training set. Because GNGs generalise very well to an arbitrary number of dimensions, it is common to speak of these data points as 𝛿-length vectors v. An iteration then proceeds as follows:

1. Find the neuron nearest to v, called the best-performing unit (BPU) in analogy to SOFMs, and move it closer to v. Also move all neurons directly connected to the BPU closer to v.
2. Determine the second-best performing unit (SBPU). If the BPU and SBPU are connected, set the age of this connection to zero. If they are not connected, connect them.
3. Increment the age of all other edges emanating from the BPU.
4. Delete every edge whose age exceeds the maximum age Amax. If this results in ‘orphan neurons’ (neurons with no edges connecting them), delete these as well.
5. Every λ iterations, identify the neuron with the largest cumulative error (the sum of its distances to the data vectors v, accumulated over the iterations) as the worst-performing unit (WPU). Insert a new neuron halfway between the WPU and its worst-performing neighbour, connect the new neuron to both, and delete the original edge between the WPU and its worst-performing neighbour.

Iterate until some stopping condition, such as a maximum number of iterations, is reached.
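The steps above can be sketched in a few dozen lines of NumPy. This is a minimal illustration rather than the companion notebook's implementation: the function name, the learning rates eps_b and eps_n, and the error-discounting factors alpha and decay are assumptions borrowed from the usual formulation of GNG, and the error is accumulated as the squared distance between v and the BPU.

```python
import numpy as np

def growing_neural_gas(data, n_iter=2000, eps_b=0.05, eps_n=0.006,
                       max_age=50, lam=100, alpha=0.5, decay=0.995, seed=0):
    """Illustrative GNG sketch; all hyperparameter defaults are assumptions."""
    rng = np.random.default_rng(seed)
    i0, i1 = rng.choice(len(data), size=2, replace=False)
    w = {0: data[i0].astype(float), 1: data[i1].astype(float)}  # neuron positions
    err = {0: 0.0, 1: 0.0}                                      # cumulative errors
    edges = {}                                                  # frozenset({i, j}) -> age
    next_id = 2
    for t in range(1, n_iter + 1):
        v = data[rng.integers(len(data))]
        ids = list(w)
        d = np.linalg.norm(np.array([w[i] for i in ids]) - v, axis=1)
        order = np.argsort(d)
        bpu, sbpu = ids[order[0]], ids[order[1]]
        err[bpu] += d[order[0]] ** 2
        # Move the BPU and its direct neighbours towards v; age the BPU's edges.
        w[bpu] += eps_b * (v - w[bpu])
        for e in edges:
            if bpu in e:
                (n,) = e - {bpu}
                w[n] += eps_n * (v - w[n])
                edges[e] += 1
        # Connect BPU and SBPU, or reset the age of their existing edge.
        edges[frozenset((bpu, sbpu))] = 0
        # Drop edges older than max_age, then neurons left without any edge.
        for e in [e for e, age in edges.items() if age > max_age]:
            del edges[e]
        connected = set().union(*edges)
        for i in [i for i in w if i not in connected]:
            del w[i], err[i]
        # Every lam iterations, grow the graph next to the worst-performing unit.
        if t % lam == 0:
            wpu = max(err, key=err.get)
            nbrs = [next(iter(e - {wpu})) for e in edges if wpu in e]
            worst_nbr = max(nbrs, key=err.get)
            new, next_id = next_id, next_id + 1
            w[new] = 0.5 * (w[wpu] + w[worst_nbr])
            del edges[frozenset((wpu, worst_nbr))]
            edges[frozenset((wpu, new))] = 0
            edges[frozenset((worst_nbr, new))] = 0
            err[wpu] *= alpha
            err[worst_nbr] *= alpha
            err[new] = err[wpu]
        # Discount all errors so that recent mistakes weigh more than old ones.
        for i in err:
            err[i] *= decay
    return w, edges

# Usage on two synthetic 2-D blobs (made-up data, for illustration only).
rng = np.random.default_rng(1)
blobs = np.vstack([rng.normal(0, 0.3, (200, 2)), rng.normal(3, 0.3, (200, 2))])
neurons, graph = growing_neural_gas(blobs)
print(len(neurons), "neurons,", len(graph), "edges")
```

Storing each edge as a frozenset of two neuron ids keeps the graph undirected without any bookkeeping, at the cost of scanning all edges to find a neuron's neighbours. That is fine for a sketch but would be worth replacing with adjacency lists for large graphs.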

It’s simple to understand how this algorithm works, but it’s worth spending some time thinking about why it works. As you might have noticed from the examples above, this method partitions the space over which the data is distributed, and does so by approximating a Delaunay triangulation (indeed, in his original paper, Fritzke referred to the graph generated by a GNG as an ‘induced Delaunay triangulation’). The idea behind a growing neural gas is that, unlike a SOFM, which requires some idea of how many neurons are needed to represent the data, a GNG determines where the model has been performing worst so far and refines that area. As a result, the graph does not grow uniformly: it expands where the current resolution (number of neurons) is no longer enough to cover (quantise) the data.

Using GNG to count clusters

In the first, introductory part on competitive neural networks, I introduced one use case for GNGs, namely as quick and efficient vector quantisation algorithms that create decent approximations of images. In the following, we’ll look at something slightly different: counting distinct objects and quantifying their sizes.

Hard and soft exudates on a fundoscopy image from the DIARETDB1 data set (Kauppi et al., 2007).

The DIARETDB1 data set by the research group of Kauppi et al. at Lappeenranta University of Technology contains 89 digital fundoscopy images (that is, images of the fundus of the eye) of five healthy volunteers and 84 people with some degree of diabetic retinopathy. Diabetic retinopathy is a complication of diabetes that affects the small blood vessels of the retina. Long-term inadequate blood glucose control leads to vascular damage, microaneurysms and exudates, where lipids (causing bright yellow hard exudates) or blood (resulting in pale, diffuse yellow soft exudates) have accumulated on the fundus. In the following, we’ll be using GNG to quantify these abnormalities. The DIARETDB1 data set contains ROI (Region of Interest) masks, but those merely outline areas that show a particular clinical feature. Can we use Growing Neural Gas to count how many clusters of hard exudates are present in the regions of interest? You bet!

Isolating the ROI using consensus masks: the consensus mask of at least two experts’ votes (bottom right) is generated from the original ROI annotations (bottom left). This mask is used to isolate the region of interest from the fundoscopy image (top left), resulting in a masked image (top right).

We begin with some image processing, namely by refining the area of interest. Each image was labelled by four experts, each of whom created a mask. We can threshold the combined masks so as to require consensus among a given number of experts, a trick widely used in annotated research imagery (scroll to the bottom if you’re unfamiliar with it!). Then, we use the relatively prominent bright yellow colour of hard exudates to convert them into data points the GNG can begin to characterise (for the nitty-gritty, do refer to the companion notebook, where some of the added tricks, including some morphological transforms, are explained).
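To make the two preprocessing steps concrete, here is a rough sketch of consensus masking followed by a simple colour threshold. It is not the notebook's pipeline: the function names, the assumption that each expert's annotation is available as a separate binary mask, and the RGB thresholds for 'bright yellow' are all hypothetical, and the morphological transforms mentioned above are left out.

```python
import numpy as np

def consensus_mask(expert_masks, min_votes=2):
    """Keep pixels marked by at least `min_votes` experts.

    expert_masks: array of shape (n_experts, height, width), nonzero = marked.
    """
    votes = np.asarray(expert_masks).astype(bool).sum(axis=0)
    return votes >= min_votes

def bright_yellow_points(image_rgb, mask, min_rg=180, max_b=120):
    """Return (row, col) coordinates of bright yellow pixels inside the mask.

    'Bright yellow' is approximated as high red and green with low blue;
    the threshold values are illustrative assumptions.
    """
    r, g, b = (image_rgb[..., c].astype(int) for c in range(3))
    yellow = (r >= min_rg) & (g >= min_rg) & (b <= max_b)
    return np.argwhere(yellow & mask)

# Tiny synthetic example: four expert masks over a 3x3 image.
masks = np.zeros((4, 3, 3), dtype=bool)
masks[0, 0, 0] = masks[1, 0, 0] = True   # two experts agree on pixel (0, 0)
masks[2, 1, 1] = True                    # only one expert marks pixel (1, 1)
image = np.zeros((3, 3, 3), dtype=np.uint8)
image[0, 0] = image[1, 1] = (250, 240, 30)  # both pixels are bright yellow

roi = consensus_mask(masks, min_votes=2)
points = bright_yellow_points(image, roi)
```

The resulting array of coordinates is the kind of point cloud a GNG can then be trained on, with each row playing the role of one data vector v.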