If you read this blog or work in the same space as I do (what I would broadly describe as computational social science), you are probably aware of the trend for physicists to make a lateral leap into researching social systems. This can be a bumpy journey, since those of us trained in physics are prone to making unrealistic simplifications and assumptions. At best, these simplifications make exact analytical solutions more feasible and often do not matter, either averaging out harmlessly or providing a vital null model against which other observations, including a real effect, can be compared. At worst, this approach yields unrealistic findings, or is dismissed as modelling ‘people as atoms’ and frankly doesn’t interest anyone except other physicists analysing social systems in equally unrealistic ways (check out Philip Ball’s recent deep dive on the history of this phenomenon).

I truly believe that everyone struggles to throw off the perspective imposed by the things they were exposed to at a formative age, and so it is my lot to forever bear the burden of a physicist’s mindset, fighting to stop myself from reducing everything I lay eyes upon to a phase transition or a hysteresis loop.

It was with this same perspective that I gladly attended a talk by the Behavioural Insights Team (BIT) organised by the UNDP innovation group. I attended gladly not just on patriotic grounds (the BIT was spun out of the UK Cabinet Office), but because I recall attending a great talk on ‘nudges’ by Cass Sunstein, the father of behavioural economics interventions, at Harvard Business School some years ago. The BIT makes a business out of small tweaks to processes in social systems, or ‘nudges’, in order to optimise them. For example, asking people to sign their tax return at the top rather than the bottom leads to more honest reporting.

What was interesting to me was the contrast between the BIT, working in government, and tech companies, which use essentially the same idea: giving a randomly selected group of people a different experience and measuring the difference in outcome to see whether there is a statistically significant improvement. This is known as a ‘randomised controlled trial’ by the BIT and more commonly as an ‘A/B test’ in industry.
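To make that idea concrete, here is a minimal sketch of such a comparison. It assumes a binary outcome (for example, clicked or did not click) and uses a pooled two-proportion z-test, which is only one of several reasonable choices. The function names `assign_groups` and `two_proportion_z_test` and the example counts are hypothetical illustrations, not anything taken from the BIT or from any company.

```python
import math
import random


def assign_groups(user_ids, seed=0):
    """Randomly split users into a control group (A) and a treatment group (B)."""
    shuffled = list(user_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that A and B behave identically.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100 of 1000 people respond in A, 130 of 1000 in B.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers the difference clears the conventional 5% threshold, but that threshold is itself a convention. A real trial would also need to fix the sample size in advance and account for running many tests at once.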

A/B tests really hit the mainstream in an unfortunate way in 2014, when researchers investigated the contagion of positive and negative emotion among Facebook users by purposefully exposing a group of users to particularly emotional content on their news feed. While I personally believe that the outrage in the face of this work was out of proportion, I also realised that people do not appreciate the scale of these constant, ongoing experiments to test new features or products. A common user test might investigate whether sending a newsletter on a Monday rather than a Friday leads to more clicks, or something as trivial as whether changing the shade of blue of a button makes people interact with it more favourably. Many such tests are constantly running on platforms such as Facebook, Amazon et al.

The difference between the A/B tests of industry and the BIT’s randomised controlled trials is that the nudges used by the latter are fewer and more carefully informed (through user interviews and consultation). Quite rightly, it might be argued, since public bodies carry a heavier burden of accountability than private companies. Both have some element of magic and ‘if it works, it works!’ to them, and both involve implicitly accepting that we are purposefully experimenting on people, knowing that some will, as a result, have a worse experience.

This is quite troubling for any right-minded person to think too hard about. But this is where I reach for my contemporary physics exemplars. If we are truly to strive for a scientific understanding of human behaviour, which may be as prosaic as making informed decisions or conducting meaningful monitoring & evaluation of processes and outcomes, we must either embrace a willingness to make invasive measurements of individuals or conduct randomised controlled trials.

This might seem a strong claim, but physical science provides a strong precedent. When seeking to learn about matter before we had microscopes and atomistic models, we had to blindly test things on macroscopic pieces of wood, metals, glasses, alloys and so forth by randomly changing their conditions and state until something worked. Even the great Thomas Edison, when inventing the electric light in 1879, blindly added paper, cotton and wood to the metal filament without understanding the underlying microscopic processes at play (the chasm that existed in the 20th century between the pragmatic engineers as inventors and the lofty European gatekeepers of physical scientific knowledge is brilliantly discussed in the retelling of the history of Edison’s lab: The Idea Factory).

Before we had the luxury of high-resolution electron microscopes and neutron diffraction machines, we simply had to try things: heat materials up, cool them down, change their shape or size, apply a magnetic field, until they did what we wanted, better. These pragmatic experiments were the A/B tests of science.

For us to even begin to approach the same mastery of human behaviour as we have of the physical world, we simply must be prepared either to invasively measure individuals, gathering enough (tightly guarded) data on their actions to form a robust theory, or to dumbly observe changes in aggregate behaviour under changes in environment with no theoretical underpinning. The former can only be done through extremely privacy-sensitive data collection; the latter means accepting that a small degree of experimentation is a necessary evil in pursuit of discovering what makes society work better.