It’s been a rocky few months for the prison system in England and Wales.

In December there was a riot at HMP Birmingham that lasted around 12 hours. Authorities had to send in Tornado teams to restore order. The bill for getting the prison back to normal, including moving inmates to other prisons and replacing locks, may run into millions of pounds.

The Birmingham riot came a month after one in Bedford and was followed a week later by another disturbance at HMP Swaleside in Kent.

Liz Truss, the Justice Secretary, gave a notably cautious statement to MPs after Birmingham:

The issues in our prisons are longstanding and they are not going to be completely solved in weeks or even months. We are working to ensure our prisons are stable while we deliver our reforms. Of course this is a major task. I am committed to this, so is the Prison Service, and I know that governors and prison officers are as well. The next few months will be difficult, but I am confident we can turn this situation around, we can turn our prisons in to places of safety and reform – and this is my absolute priority as Secretary of State.

The data has shown for a long time that Britain’s prisons are seriously overcrowded.

The Ministry of Justice uses several different measures of the prison population. They are:

Population

Capacity

Certified Normal Accommodation (CNA)

Population and capacity are fairly self-explanatory. The Ministry of Justice defines the CNA on the other hand as follows:

Certified Normal Accommodation (CNA), or uncrowded capacity, is the Prison Service’s own measure of accommodation. CNA represents the good, decent standard of accommodation that the Service aspires to provide all prisoners.

In other words, a prison can have a population above its CNA but below its capacity. It will be overcrowded, but it will still be able to cram everyone in.

The monthly prison population figures are here. Let’s take a look at them.

These latest figures are for November.

> prisons <- read.csv("prisons.csv",stringsAsFactors = FALSE)
> str(prisons)
'data.frame': 121 obs. of 7 variables:
 $ Prison.Name               : chr "Altcourse" "Ashfield" "Askham Grange" "Aylesbury" ...
 $ Baseline.CNA              : int 794 408 150 410 322 760 1093 122 545 425 ...
 $ In.Use.CNA                : int 794 408 126 410 192 760 1093 0 479 407 ...
 $ Operational.Capacity      : int 1133 400 128 444 258 906 1475 0 517 614 ...
 $ Population                : chr "1098" "394" "114" "422" ...
 $ Pop.to.In.Use.CNA         : chr "138%" "97%" "90%" "103%" ...
 $ X..Accommodation.available: chr "100%" "100%" "84%" "100%" ...

The Pop.to.In.Use.CNA column is the population expressed as a percentage of the in-use CNA. So if the CNA is 500 and the population is also 500, the value is 100% (i.e. completely full to a safe standard).
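As a quick check of how that percentage is derived, the sketch below recomputes it for Altcourse using the first values shown in the str() output above. The variable names are illustrative only and are not part of the dataset.

```r
# Figures for Altcourse, copied from the str() output above
in_use_cna <- 794
population <- 1098
operational_capacity <- 1133

# Population as a percentage of the in-use CNA
ratio <- population / in_use_cna * 100
round(ratio)  # 138, matching the "138%" in the data

# Over the uncrowded standard, but still within operational capacity
population > in_use_cna
population <= operational_capacity
```

Altcourse is therefore an example of a prison that is above its CNA while still below its capacity.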

To clean the data up:

#remove percentage sign and convert to numbers
prisons$Pop.to.In.Use.CNA <- gsub("%","",prisons$Pop.to.In.Use.CNA)
str(prisons)
prisons$Pop.to.In.Use.CNA <- as.numeric(prisons$Pop.to.In.Use.CNA)
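The same clean-up could be applied to both percentage columns at once. The sketch below is one possible way to do it, run on a small toy data frame built from the first three rows of the str() output rather than the real file; pct_to_num is a hypothetical helper name, not something from the original code.

```r
# Toy data frame standing in for prisons.csv (first three rows of the str() output)
toy <- data.frame(
  Prison.Name = c("Altcourse", "Ashfield", "Askham Grange"),
  Pop.to.In.Use.CNA = c("138%", "97%", "90%"),
  X..Accommodation.available = c("100%", "100%", "84%"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip the percentage sign and convert to numeric
pct_to_num <- function(x) as.numeric(gsub("%", "", x, fixed = TRUE))

# Apply the helper to every percentage column in one go
pct_cols <- c("Pop.to.In.Use.CNA", "X..Accommodation.available")
toy[pct_cols] <- lapply(toy[pct_cols], pct_to_num)

str(toy)
```

Wrapping the conversion in a small function keeps the two steps (strip the sign, convert the type) together, so they cannot be run out of order.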

It would be useful if we could have a column that told us whether the prison was too crowded or not. Let’s create a blank one for that purpose.

#add empty field for overcrowding
prisons$over <- NA

We’re also going to remove the data for Blantyre House because it has no in-use CNA.

#remove Blantyre House
prisons <- prisons[-118,] #its row number
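Dropping a row by its number works, but it breaks silently if the file is ever re-sorted. A more defensive sketch, shown here on a toy data frame with invented figures, filters on the condition itself. It assumes that a prison with no in-use CNA is recorded as 0 in the In.Use.CNA column, which may not hold for every release of the data.

```r
# Toy data frame with invented figures; only the prison name comes from the post
toy <- data.frame(
  Prison.Name = c("Prison A", "Blantyre House", "Prison B"),
  In.Use.CNA = c(500, 0, 300),
  stringsAsFactors = FALSE
)

# Keep only prisons that have some accommodation in use
# (assumption: "no in-use CNA" is stored as 0)
toy <- toy[toy$In.Use.CNA > 0, ]

# Alternative: drop the prison by name instead
# toy <- toy[toy$Prison.Name != "Blantyre House", ]

toy$Prison.Name
```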

If you remember from the last post, we wrote an if statement and ran it inside a for loop to fill in a column.

We’re going to do that again today, but we will build on the last post and actually create our own function to do the job.

#create overcrowded function
overcrowded <- function(i) {
  if (prisons$Pop.to.In.Use.CNA[i] > 100) {
    prisons$over[i] <- 1
    assign('prisons',prisons,envir=.GlobalEnv)
  } else {
    prisons$over[i] <- 2
    assign('prisons',prisons,envir=.GlobalEnv)
  }
}

You can see that the if/else statement is contained within the function’s curly brackets. Our function is called overcrowded and takes ‘i’ as its argument (‘i’ being the row number of the prisons data frame, in this case).

We have to use the assign function within ours because R modifies a local copy of prisons inside the function. Calling assign writes that modified copy back to the data frame we defined earlier in the global environment.

So if the ratio is greater than 100, the prison will be given a value of 1. If not (i.e. if it is 100 or less) it will take 2.

Now, the same as before, we run the for loop over every row of our data frame, in this case 117 times (our data frame is 117 rows long).

#run in for loop
for (i in 1:117) {
  overcrowded(i)
}
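For comparison, the same column could be filled without a loop or a call to assign, because R's ifelse works on a whole vector at once. This is an alternative sketch on a toy data frame with made-up ratios, not the method used in this post.

```r
# Toy data frame with made-up ratios
toy <- data.frame(
  Prison.Name = c("A", "B", "C"),
  Pop.to.In.Use.CNA = c(138, 97, 100)
)

# 1 = over the CNA, 2 = at or under it (same coding as overcrowded())
toy$over <- ifelse(toy$Pop.to.In.Use.CNA > 100, 1, 2)

toy$over
```

If you do keep the loop, writing it as for (i in seq_len(nrow(prisons))) avoids hard-coding the number of rows.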

Now we are ready to plot.

#load ggplot2 for plotting
library(ggplot2)
#one point per prison, coloured by whether it is over or under its CNA
p <- ggplot(prisons, aes(x = Prison.Name, y = Pop.to.In.Use.CNA, color = factor(over))) +
  geom_point(size = 3) +
  scale_colour_manual(values = c("red","blue"),
                      guide = guide_legend(title = "Ratio to CNA"),
                      labels = c("over","under"))
#titles, axis labels and text sizes; prison names are hidden on the x axis
p <- p + ggtitle("Overcrowding in prison") +
  labs(x = "Prison", y = "% full to uncrowded capacity") +
  theme(axis.text.x = element_blank(),
        plot.title = element_text(size = 60),
        axis.title.x = element_blank(),
        axis.title.y = element_text(size = 40),
        axis.ticks.x = element_blank(),
        axis.text.y = element_text(size = 20),
        legend.title = element_text(size = 20),
        legend.text = element_text(size = 14))
p

Entering this code gives us this plot:

Each dot represents a prison.

A red dot indicates its population is over the safe limit, while a blue dot indicates under the cut-off (i.e. 100% or less).

We did this by using the color aesthetic and factoring prisons$over.

It’s a good idea to remove the prison names because there is no way we could have them all legible on the same plot.

It would be great if we could order this plot by the y axis because we want to illustrate how many prisons are over the safe, uncrowded CNA limit.

Here is how we do that, using factors again:

#order the prison names by their ratio to CNA, then re-run the plotting code
prisons$Prison.Name <- factor(prisons$Prison.Name,
                              levels = prisons$Prison.Name[order(prisons$Pop.to.In.Use.CNA)])
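To see why this works, it may help to look at the factor levels before and after re-ordering. The sketch below uses a three-row toy data frame with invented values; ggplot2 draws a discrete x axis in level order, so changing the levels changes the order of the points.

```r
# Toy data frame with invented ratios
toy <- data.frame(
  Prison.Name = c("A", "B", "C"),
  Pop.to.In.Use.CNA = c(138, 90, 103),
  stringsAsFactors = FALSE
)

# By default, factor levels are sorted alphabetically
levels(factor(toy$Prison.Name))

# Set the levels explicitly, sorted by the ratio instead
toy$Prison.Name <- factor(
  toy$Prison.Name,
  levels = toy$Prison.Name[order(toy$Pop.to.In.Use.CNA)]
)

levels(toy$Prison.Name)  # "B" "C" "A": lowest ratio first
```

Because the plot object holds the data it was built with, the ggplot call has to be run again after the levels change.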

The result:

Analysis

The graph shows at a glance just how crowded prisons in England and Wales are.

More than half of the prisons are above their CNA.

Ten are more than 50 per cent overcrowded.

HMP Leeds, at the top, is 76 per cent overcrowded.
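Figures like these can be read off the data frame directly rather than counted from the plot. The sketch below shows the kind of calculation involved, using a four-row toy data frame with invented values, so its results are not the real figures quoted above.

```r
# Toy data frame with invented ratios (not the real November figures)
toy <- data.frame(
  Prison.Name = c("A", "B", "C", "D"),
  Pop.to.In.Use.CNA = c(176, 151, 100, 92),
  stringsAsFactors = FALSE
)

# Number and share of prisons above their CNA
n_over <- sum(toy$Pop.to.In.Use.CNA > 100)
share_over <- mean(toy$Pop.to.In.Use.CNA > 100)

# Number of prisons more than 50 per cent over their CNA
n_over_150 <- sum(toy$Pop.to.In.Use.CNA > 150)

# The most crowded prison
most_crowded <- toy[which.max(toy$Pop.to.In.Use.CNA), ]

n_over
share_over
n_over_150
most_crowded$Prison.Name
```

Run against the cleaned prisons data frame, the same expressions should reproduce the counts in this section.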

Conclusion

We’ve made some advances in R in this post by creating our own function and wrapping it in a for loop.

In doing so, we’ve shown how the prison system is crammed almost full to the brim in England and Wales.