With less than a month to go, the referendum on Scottish independence has, beyond its obvious wider importance, an interesting statistical angle. Celebrity statistician Nate Silver [1] has been variously quoted as saying there is “no chance” of a yes vote [2]. Silver is well known for causing controversy with his analysis of US elections [2], and this bleak assessment of the probability of a “yes vote” has also caused controversy in Scotland [3].

Historically, the Act of Union between England and Scotland dates as far back as 1707. In more recent times the movement for Scottish independence gained increasing prominence in the aftermath of Britain’s decline following the Suez crisis. The Scottish National Party (SNP) won a landmark by-election seat in the UK parliament in 1967, and by the 1970s the SNP had emerged as an influential force in national UK politics. Following Labour’s landslide general election victory in 1997, the Scottish Parliament was formed, with its first parliamentary elections taking place in 1999. A referendum on Scottish independence emerged as a manifesto pledge in the lead-up to the 2007 Scottish Parliamentary elections but did not come into being, as the SNP operated as a minority government in that parliament. In 2011 the SNP won an outright majority in the Scottish Parliamentary elections, and a referendum on Scottish independence was finally agreed in 2012.

Having recently played host to the Commonwealth Games, Scotland retains a proud cultural and sporting history. Both English and Scottish teams appeared to enjoy a post-London 2012 Olympic “bounce”, with several notable Commonwealth successes for both teams. There are many close sporting links (and on occasion intense rivalries) between Scotland and the rest of the United Kingdom. However, the relationship between England and Scotland clearly runs much deeper. There is significant population movement between England and Scotland in both directions. Roughly 50% of Scots have relatives living in England. Moreover, there are close economic, financial and military links between Scotland and the rest of the United Kingdom. Against this backdrop, the uncertainties caused by the possibility of a “yes vote” for Scottish independence loom large.

Scottish identity and Scottish independence are as much about passion, flesh and blood, and heart and soul as they are about statistics [3]. As a simple illustration, the non-partisan website What Scotland Thinks [4] provides information on a wide range of issues such as attitudes to independence both in Scotland and throughout the rest of the UK, perceptions of devolution and Scottish national identity. Sections of the website also specifically examine the attitudes of young people, alongside several interesting demographic breakdowns of opinions by age and gender on key issues in the independence debate.

When it comes to the Scottish Referendum both Englishmen and statisticians alike need to tread carefully. The debate over Scottish independence is clearly highly emotive [3].

In [3] Nate Silver’s assessment of there being “no chance” of a yes vote is criticised on the basis that polling in the UK is less developed than in the US (where Silver has had demonstrable success). The available data from opinion polls does appear somewhat volatile (see below).

Further, there appears to be some ideological opposition to the use of quantitative methods in circumstances such as these [3]. Clearly there is a richness to Scotland and Scots alike that cannot be described in purely quantitative terms. However, as regards predicting the Referendum these criticisms appear to be largely moot.

The Efficient Markets Hypothesis (EMH) lays down the theoretical basis for prediction via adequately functioning financial markets. This is reinforced by a growing body of research into so-called Prediction Markets [5]. Put simply, if large sums of money are at stake, bookmakers have every incentive to estimate probabilities accurately. This is especially true if we average over a range of different bookmakers who are in direct competition with each other. If they do not estimate the odds accurately enough, evolutionary finance tells us that bookmakers will eventually become extinct [6]. So even if data from opinion polls is flawed, and on occasion may even be susceptible to various forms of political bias [2], the evidence suggests that combining bookmaker odds should result in a reasonable estimate of the probabilities in question.

It is interesting to try to predict the result of the Scottish independence referendum using methods from probability and statistics. In [7] the authors do this by looking both at the results of opinion polls and bookmakers’ odds pointing out that the two methods give slightly different answers. Here, we seek to combine these two methods.

Opinion Polls

The non-partisan website [4] gives the results of several polls which asked the question “should Scotland be an independent country?”. Data is available on this from February 2013 to August 2014. After removing “don’t know” responses, we plot the average proportion of people in favour of independence in each month (see Figure 1).

Figure 1: Monthly average responses to the question should Scotland be an independent country?

The results show that the opinion polls are intrinsically uncertain and subject to a substantial amount of random variation. Nevertheless, the proportion of people in favour of Scottish independence appears to be increasing over time. In a simple regression analysis, a t-test (not reported) shows that this trend is statistically significant. (Following helpful comments from a reviewer, who suggested that the residuals may be autocorrelated, a simple econometric test, also not reported, found no evidence of first-order autocorrelation in the residuals.)

However, whilst the proportion in favour of Scottish independence does appear to be growing over time, it does not appear to be increasing at a rate that would suggest a high probability of a “yes vote”. Extrapolating the fitted model one month ahead, from August to September, suggests that 43.9% would vote in favour of independence. The associated 95% prediction interval is (37.7%, 50.1%), again reflecting a large amount of uncertainty in the results of the opinion polls. Given this uncertainty, it is perhaps just possible that the Scots might vote marginally in favour of independence. However, such a result would represent a marked departure from the results of previous opinion polls (see Figure 1).
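The one-month-ahead extrapolation and its prediction interval can be sketched for a simple linear regression on time. The Python sketch below is illustrative only: the function name ols_prediction_interval is hypothetical, the monthly poll averages themselves are not reproduced here, the critical value of the t distribution has to be supplied by the caller, and the original analysis may have used a different model specification.

```python
import math


def ols_prediction_interval(x, y, x_new, t_crit):
    """Point forecast and prediction interval from a simple linear regression.

    x, y   : equal-length sequences (e.g. month index and monthly average yes share)
    x_new  : the value of x at which to forecast
    t_crit : two-sided critical value of the t distribution with n - 2 degrees
             of freedom (supplied by the caller; not computed here)
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares estimates of slope and intercept
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)

    # Standard error of a new observation at x_new
    se_pred = math.sqrt(s2 * (1.0 + 1.0 / n + (x_new - x_bar) ** 2 / sxx))

    y_hat = intercept + slope * x_new
    return y_hat, y_hat - t_crit * se_pred, y_hat + t_crit * se_pred
```

If, for instance, every month from February 2013 to August 2014 contributed one observation (an assumption), there would be 19 data points, 17 residual degrees of freedom and a two-sided 95% critical value of about 2.11.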

Estimating Probabilities from Bookmakers’ Odds

It is an interesting exercise to see how bookmaker odds correspond to implied probabilities. One bookmaker stated that the odds of Scotland voting No in the referendum were 1/9. The implied probability p of a no vote is found by setting (1-p)/p = 1/9, which gives p = 0.9.

The same bookmaker also gave the odds of Scotland voting Yes in the referendum as 5/1. These odds imply a second estimate of the probability p of a no vote, found by setting p/(1-p) = 5/1, which gives p = 5/6.

It is standard econometric practice to compute the estimated probability as the average of these two values: p = (0.9 + 5/6)/2 = 13/15 ≈ 0.867.

The spread is given by (0.9 - 5/6)/(13/15) = 1/13 ≈ 0.0769. This spread of 7.7% appears in line with similar figures reported in [7].
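The arithmetic above can be reproduced with exact fractions. This is a minimal Python sketch; the helper name implied_probability is hypothetical, and it assumes fractional odds a/b are read as "win a for every b staked", so that the implied probability of the outcome is b/(a+b).

```python
from fractions import Fraction


def implied_probability(fractional_odds):
    """Probability implied by fractional odds a/b on an outcome: b / (a + b)."""
    a = fractional_odds.numerator
    b = fractional_odds.denominator
    return Fraction(b, a + b)


odds_no = Fraction(1, 9)   # quoted odds of a no vote
odds_yes = Fraction(5, 1)  # quoted odds of a yes vote

# Two estimates of the probability of a no vote
p_no_direct = implied_probability(odds_no)          # 9/10
p_no_from_yes = 1 - implied_probability(odds_yes)   # 5/6

# Midpoint estimate and relative spread, as in the text
p_no = (p_no_direct + p_no_from_yes) / 2            # 13/15
spread = (p_no_direct - p_no_from_yes) / p_no       # 1/13, about 0.0769

print(p_no, round(float(spread), 4))
```

The corresponding implied probability of a yes vote for this bookmaker is 1 - 13/15 = 2/15, or about 0.133.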

Using the website [8] we can instantly check the implied probabilities relating to the listed betting odds of 21 different bookmakers. Using the above method the average implied probability of a “yes vote” was 0.149. The highest estimated probability of a “yes vote” was 0.178. The lowest estimated probability of a “yes vote” was 0.133.

We model the proportion of “yes votes”, X, in the Scottish Referendum as a Beta distribution with parameters A and B under the Bayesian paradigm [9]. We set the mean equal to the projected proportion from the opinion polls, that is, we set E[X] = A/(A+B) = 0.439.

Averaging the results over different bookmakers suggests that the probability that X is greater than 0.5 is 0.149. Together with the condition on the mean, this leads to a set of nonlinear simultaneous equations in A and B that can be solved numerically in R. Omitting the full details, a plot of the estimated probability density is shown below in Figure 2. A 95% Highest Density Interval (a Bayesian credible interval) is (0.327, 0.554) [9]. This suggests that with probability 0.95 the proportion of “yes votes” will lie between 33% and 55%. Whilst this might not be a very exciting prediction, it clearly illustrates that the result of the Scottish Referendum is subject to quite a lot of uncertainty.
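One way to solve the two conditions, E[X] = 0.439 and P(X > 0.5) = 0.149, is sketched below in Python using only the standard library. This is not the original R code: the function names are hypothetical, and writing A = mk and B = (1-m)k with a single unknown k = A+B, integrating the density by Simpson's rule, and searching for k by bisection are all choices made for this illustration.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution (taken as zero at the endpoints)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


def upper_tail(a, b, threshold=0.5, steps=2000):
    """P(X > threshold) by composite Simpson's rule (steps must be even)."""
    h = (1.0 - threshold) / steps
    total = beta_pdf(threshold, a, b) + beta_pdf(1.0, a, b)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * beta_pdf(threshold + i * h, a, b)
    return total * h / 3.0


def solve_beta(mean, tail_prob, threshold=0.5, k_lo=4.0, k_hi=2000.0):
    """Find (A, B) with A/(A+B) = mean and P(X > threshold) = tail_prob.

    With the mean below the threshold, a larger k = A + B concentrates the
    distribution around the mean and shrinks the upper tail, so bisection on
    k is used. The search bounds k_lo and k_hi are assumptions.
    """
    for _ in range(60):
        k_mid = 0.5 * (k_lo + k_hi)
        if upper_tail(mean * k_mid, (1.0 - mean) * k_mid, threshold) > tail_prob:
            k_lo = k_mid  # tail still too heavy: concentrate further
        else:
            k_hi = k_mid
    k = 0.5 * (k_lo + k_hi)
    return mean * k, (1.0 - mean) * k


# Mean from the opinion polls, tail probability from the bookmakers
A, B = solve_beta(0.439, 0.149)
print(A, B, upper_tail(A, B))
```

Fixing the mean first reduces the problem to a one-dimensional root search, which avoids the need for a general nonlinear equation solver.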

Figure 2: Estimated probability distribution for the proportion of yes votes

Scottishness and Scottish Independence are hugely emotive issues that transcend traditional stereotypes of the “British stiff upper lip”. Despite their imperfections, opinion polls suggest that the movement for Scottish Independence is receiving increasing support. However, irrespective of considerable uncertainty about the actual final result, the probability that Scots will vote for independence still appears quite low (around 15% according to most bookmakers).

Voting “yes” in an independence referendum has been linked to an increased willingness to accept risk [2, 7]. In addition to the close cultural and personal ties between Scotland and the rest of the United Kingdom, perhaps concerns over the potential severing of economic, financial and military links are beginning to dominate the debate.

References

[1] Silver, N. (2012) The Signal and the Noise: the Art and Science of Prediction. Penguin, New York.

[2] Champkin, J. and Oliver, A. (2013) Nate Silver: a life in statistics. Significance December 2013 36-39.

[3] McNaught, M. (2014) Nate Silver’s rash and uninformed prediction. NewsnetScotland.com http://newsnetscotland.com/index.php/scottish-opinion/7860-nate-silvers-rash-and-uninformed-prediction

[4] What Scotland Thinks http://whatscotlandthinks.org/

[5] See e.g. The Journal of Prediction Markets ubplj.org/index.php/jpm/index

[6] Evstigneev, I., Hens, T. and Schenk-Hoppe, K. R. (2009) Evolutionary finance. In Handbook of Financial Markets: Dynamics and Evolution, ed. T. Hens, K.R. Schenk-Hoppe, 507-564. Elsevier, Amsterdam.

[7] Bell, D. N. F. (2014) The independence referendum: Predicting the outcome. http://www.futureukandscotland.ac.uk/papers/independence-referendum-predicting-outcome

[8] Oddschecker website. (Stated odds correct as of 18/8/2014.) http://www.oddschecker.com/politics/british-politics/scottish-independence/referendum-outcome

[9] Lee, P. M. (2012) Bayesian Statistics: an Introduction, Fourth edition. Wiley.