Conventional wisdom holds that quantum mechanics is hard to learn. This is more or less correct, although often overstated. However, the necessity of abandoning conventional ways of thinking about the world, and finding a radically new way – quantum mechanics – can be understood by any intelligent person willing to spend some time concentrating hard. Conveying that understanding is the purpose of this essay.

Reading the essay requires a little more effort than most blog posts. The argument is occasionally a little abstract, and you may need to read over some paragraphs quite carefully, or perhaps more than once. Ideally, you’ll test your understanding by explaining the entire argument to someone else. The effort is worth it, for when you’re done, you’ll understand one of the great discoveries of all time: why the world needs quantum mechanics.

One of the challenges of understanding modern physics is that some of the concepts seem quite abstract when you’re talking about microscopic objects outside the realm of everyday experience. So let’s first get our bearings in a more conventional setting.

I want to talk about coins. We take it for granted that we can determine whether a coin has landed heads or tails; these seem like self-evident properties. But actually quite a lot is going on when we make that determination. Sunlight or some other type of light has to bounce off the coin and into your eye, then stimulate your optic nerve, before finally registering either “heads” or “tails” in your brain [1].

This process of figuring out whether the coin is heads or tails is what physicists call a measurement process. In physicists’ language, what’s going on when we look at the coin is that we’re measuring a two-valued or binary property of the coin. This usage of the term measurement is somewhat different from everyday usage, where, for example, we might measure something with a ruler. But the basic idea is the same – a measurement is a process that determines a physical property, whether it be the length of an object, or the side on which a coin has landed.

All this language may seem pedantic – we’re just looking at a coin! But it comes in handy when we move to the microscopic realm of photons, the tiny particles that make up light. When you see red light, for example, what’s going on is that lots and lots of red photons are entering your eye. The more that enter, the brighter the red sensation.

Photons, like coins, can have binary properties. One of those properties is something called polarization. You’re probably already familiar with polarization, although you may not realize it. If you take a pair of sunglasses, and hold them up towards the surface of the ocean or a pool on a sunny day, you’ll notice that depending on the angle you hold the sunglasses, different amounts of light come through. What this means is that depending on the angle, different numbers of photons are coming through [2].

Imagine, for example, that you hold the sunglasses horizontally:

The photons that make it through the sunglasses have what is called horizontal polarization. Not all photons coming toward the sunglasses have this polarization, which is why not all of the photons make it through. In our earlier language, what’s going on is that the sunglasses are measuring the photons coming toward the sunglasses, to determine whether or not they have horizontal polarization. Those which do, pass through the sunglasses; those which do not, are blocked. Again, it’s not quite the everyday meaning of “measurement”, but hopefully you’re getting the hang of the physicists’ language.

There are other, different physical properties that can be measured in a similar way. For example, imagine holding the sunglasses at 45 degrees to horizontal:

The photons that make it through the sunglasses have a polarization at 45 degrees to horizontal. In our earlier language, these sunglasses are again measuring a binary property of the photons, in this case whether they have a polarization at 45 degrees to the horizontal or not [3].

Physicists routinely measure polarization in their laboratories. They don’t use sunglasses; they use “polarization photodetectors” instead. Despite the intimidating name, these are essentially just like sunglasses, but have a more convenient shape and size for laboratory use, are more accurate, less fashionable, and far more expensive.

I’m now going to describe an experiment involving photon polarization that physicists can do in their laboratories. We’ll build up the description of the experiment piece by piece. Along the way there are a few details that may seem ad hoc – some angles of polarization measurement, and things like that. Don’t worry too much about those ad hoc details; just try to get the basic picture straight.

Let’s start by imagining an experimentalist named Alice. Alice is measuring a photon to determine whether or not it has horizontal polarization. Alice will record A = 1 when it does have horizontal polarization, and A = -1 when it does not.

Of course, Alice might have decided to measure a different polarization, say at an angle of 45 degrees to the horizontal. Alice will record B = 1 when it has a polarization at 45 degrees to the horizontal, and B = -1 when it does not. Here’s a picture summarizing the different things I want you to imagine Alice doing. By the way, I haven’t put the photon she’s measuring in, but you should imagine it coming into the screen, towards the sunglasses:

Let’s move briefly away from photons, and back to coins. The usual way we think about the world is that the coin is either heads or tails, and our measurement reveals which. The coin intrinsically “knows” which side is facing up, i.e., its orientation is an intrinsic property of the coin itself. By analogy, you’d expect that a photon knows whether it has horizontal polarization or not. And it should also know whether it has a polarization at 45 degrees to horizontal or not.

It turns out the world isn’t that simple. What I’ll now prove to you is that there are fundamental physical properties that don’t have an independent existence like this. In particular, we’ll see that prior to Alice measuring the A or B polarization, the photon itself does not actually know what the value for A or B is going to be. This is utterly unlike our everyday experience – it’s as though a coin doesn’t decide whether to be heads or tails until we’ve measured it.

That last paragraph may have sounded like gobbledygook. In fact, if it didn’t give you pause, I suggest you go back and reread it. The reason it’s difficult to understand is that the paragraph is really a declaration of non-understanding, a declaration that the world is radically different from our intuitive understanding.

To prove this, what we’ll do is first proceed on the assumption that our everyday view of the world is correct. That is, we’ll assume that photons really do know whether they have horizontal polarization or not, i.e., they have intrinsic values A = 1 or A = -1 (and, for that matter, B = 1 or B = -1). We’ll find that this assumption leads us to a conclusion that is contradicted by real experiments. The only way this can be the case is if our original assumption is in fact wrong, i.e., photons don’t have intrinsic properties in this way.

This strategy may sound complex, but we reason similarly quite often in our everyday experience. Imagine your Aunt has shown you how to bake a cake. You decide to bake it on your own, but realize partway through that you’ve forgotten whether she said to put one or two cups of flour into the cake. You decide to proceed on the assumption that it was one cup of flour. Unfortunately, the cake falls and is a disaster; you conclude that your original assumption was wrong, and the cake must have needed two cups. In a similar way, if we proceed on the assumption that photons do have intrinsic values for A and B, and then arrive at a contradiction with experiment, we’ll know our original assumption must have been wrong.

Alright, let’s finish describing the experiment. In addition to Alice, the experiment involves another experimentalist, Bob, and a third person, Eve, who prepares two photons, and sends one to Alice, and one to Bob. When the photon gets to Alice, she measures one of the polarization values, A or B, as described above. She makes the choice of which to measure at random (e.g., by flipping a coin), for reasons which we’ll understand later. When the photon gets to Bob, he decides at random to measure either the polarization C, at 22.5 degrees to horizontal, or D, at 67.5 degrees to horizontal. Here’s a picture summarizing what’s going on, but leaving out Eve and the photons that she sent to Alice and Bob:

To make this all more concrete, let’s think about what might happen in a typical instance of the experiment. Over on Alice’s side, she decides to measure the B polarization of her photon, and gets the result 1, i.e., the polarization at 45 degrees to horizontal. Over on Bob’s side, he decides to measure the C polarization of his photon, and gets the result -1, i.e., the photon does not have polarization at 22.5 degrees to horizontal.

You might imagine Alice, Bob and Eve doing this experiment many times. If they did this, they could conveniently represent the separate runs of the experiment in a table:

Run   Alice’s result (A or B)   Bob’s result (C or D)
1     1                         1
2     -1                        -1
3     1                         1
4     -1                        1

Each row of the table represents a single run of the experiment, so this table shows a case where they did the experiment four times. Looking at the first row of the table, we see that in the first run of the experiment Alice chose to measure A, and got the result 1, while Bob chose to measure D, and also got the result 1.

Now that we’ve understood how the experiment is performed, let’s move on to the analysis. Remember, we’re starting from the assumption that the respective photons have independently existing and well-defined values for A, B, C, and D. Two of these four values are revealed in any given instance of the experiment, depending on what Alice and Bob choose to measure. However, because all four quantities have (by assumption) an independent existence, we can consider quantities which involve all four, like the quantity Q defined by the equation

Q = AC + BC + BD – AD.

(Things like AC mean A times C – it makes the essay less messy to omit the multiplication sign.)

I must apologize for springing this quantity Q on you completely out of the blue. It’s as though a friend suddenly started reciting ancient poetry in mid-conversation; you would certainly wonder why. It turns out that the easiest way to understand this material is to accept the definition of Q for now, and move forward. With a little more work, we’ll see that thinking about Q leads to some very interesting conclusions. With those conclusions in mind, we’ll be able to double back, and understand better where Q came from.

Although Q’s definition may appear to have come from out of the blue, it’s certainly easy enough to calculate for any given set of values for A, B, C, and D. For example, when A = 1, B = -1, C = 1 and D = -1 we get

Q = 1 x 1 + (-1) x 1 + (-1) x (-1) – 1 x (-1) = 2.
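
If you’d rather let a computer do the arithmetic, here is a minimal Python sketch of the same calculation. The function name chsh_q is a label invented for this illustration, not standard notation; it simply evaluates the definition of Q given above, and refuses any value other than 1 or -1.

```python
def chsh_q(a, b, c, d):
    """Return Q = AC + BC + BD - AD for polarization values of +1 or -1."""
    for value in (a, b, c, d):
        if value not in (1, -1):
            raise ValueError("each polarization value must be 1 or -1")
    return a * c + b * c + b * d - a * d


# The worked example from the text: A = 1, B = -1, C = 1, D = -1.
example_q = chsh_q(1, -1, 1, -1)
print(example_q)  # prints 2
```

Changing the four arguments lets you try any other combination of values you like.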

In fact, it turns out that no matter what value A, B, C and D have, the value of Q is always equal to either 2 or -2. If you like, you can run through all 16 sets of possible values for A, B, C and D, and verify that Q is indeed always either 2 or -2. I won’t go through all that here, although I encourage you to pause and go through the exercise on paper [4].
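
If paper isn’t handy, the same exercise can be done by brute force. The following is a small Python sketch, with variable names chosen just for this illustration, that lists all 16 assignments of 1 or -1 to A, B, C and D, prints Q for each, and collects the distinct values that turn up.

```python
from itertools import product

# Every assignment of +1 or -1 to (A, B, C, D): 2**4 = 16 in total.
assignments = list(product((1, -1), repeat=4))

# Q = AC + BC + BD - AD for each assignment.
q_values = [a * c + b * c + b * d - a * d for a, b, c, d in assignments]

for (a, b, c, d), q in zip(assignments, q_values):
    print(f"A={a:+d} B={b:+d} C={c:+d} D={d:+d}  ->  Q={q:+d}")

# The set of distinct values Q takes across all 16 cases.
distinct_q = set(q_values)
print(distinct_q)
```

The final line prints a set containing only 2 and -2, which is the claim made above.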

Now, when Alice and Bob actually do an experiment, Alice chooses to measure just one of A or B, and Bob chooses to measure just one of C or D. So they can’t actually measure Q directly, although on any given run they can determine one of the four terms that make up Q, that is, they can always determine one of AC, BC, BD or -AD.
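
To make the bookkeeping concrete, here is a hedged Python sketch of how a single run maps onto one term of Q. The names TERM_SIGN and term_from_run are hypothetical, invented for this illustration; the only physics input is the pattern of signs in Q = AC + BC + BD - AD.

```python
# Sign with which each pair of settings enters Q = AC + BC + BD - AD.
TERM_SIGN = {("A", "C"): 1, ("B", "C"): 1, ("B", "D"): 1, ("A", "D"): -1}


def term_from_run(alice_setting, alice_result, bob_setting, bob_result):
    """Return (term name, value) for the single term of Q this run reveals."""
    sign = TERM_SIGN[(alice_setting, bob_setting)]
    name = ("-" if sign < 0 else "") + alice_setting + bob_setting
    return name, sign * alice_result * bob_result


# The run described earlier: Alice measures B and gets 1, Bob measures C and gets -1.
print(term_from_run("B", 1, "C", -1))  # ('BC', -1)

# Another possible run: Alice measures A and gets 1, Bob measures D and gets 1.
print(term_from_run("A", 1, "D", 1))   # ('-AD', -1)
```

Each run contributes to exactly one of the four terms, which is why many runs are needed before anything can be said about all four.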

But if they repeat the experiment many times, Alice and Bob can build up an average value for each of the four quantities AC, BC, BD and -AD. Because the total of these four quantities is always 2 or -2, as we’ve seen, the sum of their averages over multiple runs of the experiment cannot possibly be more than 2:

Avg(AC)+Avg(BC)+Avg(BD)-Avg(AD) ≤ 2.

To understand why this is true, imagine you calculated the average population of all the countries in the world. Whatever the average is, it’s definitely going to be less than the population of the most populous country. In the same way, an average of values of Q, each of which is 2 or -2, can never be more than 2.
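
Here is a rough Python simulation of the experiment under the assumption being tested, namely that every photon pair carries predetermined values for A, B, C and D. Everything in it is a sketch: simulate_chsh and the two preparation strategies for Eve (all_ones and random_values) are hypothetical names and choices made for illustration, and any other rule for assigning the four values could be swapped in.

```python
import random


def simulate_chsh(prepare_pair, runs, seed=0):
    """Estimate Avg(AC) + Avg(BC) + Avg(BD) - Avg(AD) over many runs.

    prepare_pair(rng) must return predetermined values (a, b, c, d), each
    +1 or -1: this is the "photons know their values" assumption.
    """
    rng = random.Random(seed)
    totals = {"AC": 0, "BC": 0, "BD": 0, "AD": 0}
    counts = {"AC": 0, "BC": 0, "BD": 0, "AD": 0}
    for _ in range(runs):
        a, b, c, d = prepare_pair(rng)
        alice = rng.choice("AB")  # Alice picks her measurement at random
        bob = rng.choice("CD")    # Bob picks his measurement at random
        alice_result = a if alice == "A" else b
        bob_result = c if bob == "C" else d
        totals[alice + bob] += alice_result * bob_result
        counts[alice + bob] += 1
    avg = {key: totals[key] / counts[key] for key in totals}
    return avg["AC"] + avg["BC"] + avg["BD"] - avg["AD"]


def all_ones(rng):
    # Hypothetical strategy: every photon pair carries A = B = C = D = 1.
    return 1, 1, 1, 1


def random_values(rng):
    # Hypothetical strategy: four independent fair coin flips per pair.
    return tuple(rng.choice((1, -1)) for _ in range(4))


print(simulate_chsh(all_ones, runs=20000))       # exactly 2.0, the largest allowed
print(simulate_chsh(random_values, runs=20000))  # close to 0
```

With a finite number of runs the estimate can wobble slightly around its true value, but under this assumption no strategy for Eve pushes the true value above 2.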

The inequality above is called the Clauser-Horne-Shimony-Holt (CHSH) inequality, after the names of its four discoverers. CHSH were building on earlier ideas of John Bell, who discovered a similar inequality in 1964.

You might wonder why we need to average in the CHSH inequality. Why can’t Alice measure both A and B, and Bob measure both C and D, so they can determine Q directly?

To understand this, remember that the idea we’re testing is the idea that the photon has an actual intrinsic value for A and an actual intrinsic value for B, each of which is merely revealed by the measurement. A single photon is quite delicate, and if Alice measured both A and B, there’s a chance the measurement of A would interfere with the measurement of B, and vice versa, and so mess up the measurement of Q. To keep things clean we force Alice to choose which one she wants to measure in any given instance, and stick to it. That’s why we have to work with averages over many experiments.

If you’re a bit more paranoid, you might also wonder if maybe Alice’s measurement could interfere with what Bob sees. This may seem unlikely, but it’s at least conceivable. However, Einstein’s relativity tells us that no influence can travel faster than the speed of light. If Alice and Bob are far enough apart, and do their measurements simultaneously and very quickly, nothing Alice does can possibly affect what Bob sees.

So, in principle, it ought to be possible for Alice and Bob to do the experiment many times, and work out the averages Avg(AC), Avg(BC), and so on, and check that the CHSH inequality does, in fact, hold.

An experiment testing this was done in the early 1980s, by Alain Aspect’s group, in France [5]. Experimentally, they found that if Eve prepares the two photons in just the right way, then what Alice and Bob see after many runs of the experiment is:

Avg(AC)+Avg(BC)+Avg(BD)-Avg(AD) ≅ 2.8.

That is, Aspect found that the CHSH inequality fails to hold in the real world! This means our belief that objects have intrinsic properties with their own independent existence must actually be wrong. The experimental failure of the CHSH inequality forces us to seek an alternate way of understanding the world, a way radically different from our conventional way of thinking.

Fortunately, a more radical theory of the world is available, a theory in which objects don’t have intrinsic properties that exist in and of themselves. That more radical theory is quantum mechanics. I won’t explain how the quantum mechanical analysis of the Aspect experiment works; that’s not the point of this essay. I will report though, that if you use quantum mechanics to analyze Aspect’s experiment, the prediction you get matches the experimental results exactly. In fact, Clauser, Horne, Shimony and Holt had already done the quantum mechanical analysis in advance of the experiment, and knew this. What the Aspect experiment did was provide a real-world example where the CHSH inequality demonstrably fails, yet quantum mechanics explains the results perfectly [6].
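
Without going through the quantum mechanical analysis, it is still possible to check the headline number. The Python sketch below leans on an assumption that is not derived anywhere in this essay: the standard textbook prediction that, for two photons prepared in a maximally polarization-entangled state, the average of the product of Alice’s and Bob’s results is the cosine of twice the angle between their measurement directions. The name quantum_average is hypothetical; the angles are the ones used for A, B, C and D above.

```python
import math

# Measurement angles from the essay, in degrees from horizontal.
ANGLES = {"A": 0.0, "B": 45.0, "C": 22.5, "D": 67.5}


def quantum_average(alice, bob):
    """Assumed textbook prediction for Avg(alice * bob): cos(2 * angle difference).

    This formula is NOT derived in the essay; it is the standard result for a
    pair of photons in a maximally polarization-entangled state.
    """
    difference = math.radians(ANGLES[alice] - ANGLES[bob])
    return math.cos(2 * difference)


chsh_sum = (quantum_average("A", "C") + quantum_average("B", "C")
            + quantum_average("B", "D") - quantum_average("A", "D"))
print(round(chsh_sum, 3))  # 2.828, i.e. 2 * sqrt(2)
```

Under that assumption the sum comes out at about 2.8, and it also hints at why the seemingly ad hoc angles of 22.5 and 67.5 degrees were chosen: each of the four terms then contributes the same amount.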

The analysis done in this essay can be extended to nearly all physical properties. In principle, it holds even for everyday properties like whether a coin is heads or tails, whether a cat is alive or dead, or nearly anything else you care to think of. Although experiments like the Aspect experiment are still far too difficult to do for these much more complex systems, quantum mechanics predicts that in principle it should be possible to do an experiment with these systems where the CHSH inequality fails. Assuming this is the case – and all the evidence points that way – at some fundamental level it is a mistake to think even of everyday properties as having an intrinsic independent existence.

You might wonder what this all means. Should you lose your belief in the idea that objects have intrinsic properties with an independent existence? Should you start thinking about your coins or your cat as though they might be in some indeterminate state? The answer, of course, is no: believing in such intrinsic properties is a perfectly good way to go about your everyday life. In fact, quantum physicists have spent quite a bit of time trying to understand why it is that so many properties in practice do behave like intrinsic properties with their own independent existence. The analysis is complex, but the final conclusion is unambiguous: for most practical everyday purposes, we can treat a coin as knowing whether it is heads or tails, and a cat as knowing whether it is alive or dead. Although these beliefs are not correct at some fundamental level, in most practical situations they work extremely well. It’s only in extraordinary circumstances quite outside everyday life that this way of thinking could ever lead you astray.

I promised that we’d go back and try to understand where Q comes from. In fact, Q was no less mysterious for Clauser, Horne, Shimony and Holt than it is for you. When they started their work, they had in mind an argument roughly like the one above (which was inspired by Bell) but they did not have a specific form for Q in mind. Their idea was to find a form for Q using trial-and-error so that they could prove an inequality like the CHSH inequality, and also simultaneously find a situation where quantum mechanics predicted that the inequality should fail to hold. That strategy allowed them to suggest an experiment – the experiment ultimately done by Aspect – which could be used to test between the two views of reality. I don’t know how long it took them to find their form for Q, but I suspect it took hundreds of hours of hard work. If you’ve been wondering what Q “means”, that’s your answer: it’s the answer to the question Clauser, Horne, Shimony and Holt were asking about what quantity would best let them distinguish between our usual picture of the world, and the actual reality. Given how long it took them to answer that question, it would not be surprising if you got a bit of a jolt when I introduced Q out of the blue.

The need for quantum mechanics isn’t ordinarily explained the way I have described in this essay. I think this is a pity, because the explanation here is, in my opinion, simpler, more compelling, and more clearcut [7] than the standard explanation.

The standard explanation is based on the historical development of quantum mechanics between 1900 and 1930. During that time there was a series of crises in physics. The pattern each time was that some experimental fact would be noticed that seemed hard to explain with the old “classical” way of viewing the world. Each time, physicists would bandage over the old classical thinking with an ad hoc bandaid. This happened over and over again until, in the mid-1920s, the sick patient of classical physics finally keeled over completely, and was replaced with the new framework of quantum mechanics.

The problem with this style of explanation, and what makes it confusing, is that none of those early crises was entirely clearcut. In each case, there were physicists who argued that the new experimental results could be explained pretty well with a conventional classical picture. And, in fact, with hindsight, we can now see that some of these crises have pretty good explanations that are essentially classical.

What’s beautiful about the CHSH inequality and the Aspect experiment is that they are so simple and compelling. They leave no doubt that we have to abandon our conventional assumptions about the world, and confront the need for a radically new theory. That theory is quantum mechanics.

Further reading

If you liked this essay, you may enjoy my essay “What makes quantum computers powerful”, to appear on this blog in two weeks’ time.

An excellent elementary introduction to quantum mechanics is Richard Feynman’s QED: The Strange Theory of Light and Matter.

You may enjoy some of my other essays.

Acknowledgments

Thanks to Dave Bacon, Jen Dodd, Mary Granade, Kate Nielsen, Amund Tveit, and Jo Vermeulen for feedback that improved an early draft of this essay.

About the author

Between 1995 and 2008, Michael Nielsen was a professional theoretical physicist. During that time he co-authored the standard text on quantum computing, proved one of the fundamental theorems about the behaviour of entangled quantum states, and participated in one of the first quantum teleportation experiments. None of this made him feel comfortable with quantum mechanics.

Michael is now a writer living outside Toronto, and working on a book about “The Future of Science”. A taste of the book may be found here. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. You’ll be emailed to let you know when the book is to be published; your email address will not be used for any other purpose.

Footnotes

[1] Of course, a coin might also land on its side. We’ll ignore that for the purposes of the present discussion.

[2] Not all sunglasses are polarizing in this way. But many are. You can check if your sunglasses are polarizing by holding them up towards pretty much any surface that reflects glare. The ocean or a pool on a sunny day work well.

[3] You might be wondering whether there’s any relationship between a photon having horizontal polarization, and having polarization at 45 degrees (or some other angle) to horizontal. This is a good question, and the answer is that there is a relationship. But it would take us quite a ways afield to understand the relationship, and we don’t need it for the purposes of this essay, so I’ve skipped over it.

[4] An alternate way of seeing that Q is always 2 or -2 starts by rewriting Q as

Q = (A+B)C + (B-A)D.

We can split our analysis up into two cases: the case when A = B, and the case when A = -B. One of these two must always be true, because A and B are both always either 1 or -1.

First case: A = B. In this case the B-A terms in Q vanish, leaving just contributions from the (A+B)C term. A bit of thought and experimentation should convince you this is either 2 or -2.

Second case: A = -B. The A+B terms vanish, leaving just contributions from the (B-A)D term, which again a bit of thought should convince you is either 2 or -2.
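
For readers who prefer to see the “bit of thought” carried out mechanically, here is a short Python sketch that checks the regrouped form of Q and both cases of this footnote against every possible assignment. The variable names are chosen just for this illustration.

```python
from itertools import product

for a, b, c, d in product((1, -1), repeat=4):
    q = a * c + b * c + b * d - a * d
    # The regrouped form used in this footnote.
    regrouped = (a + b) * c + (b - a) * d
    assert q == regrouped
    if a == b:
        # First case: (B - A) is 0 and (A + B) is 2 or -2, so Q = (A + B) * C.
        assert b - a == 0 and q == (a + b) * c
    else:
        # Second case: (A + B) is 0 and (B - A) is 2 or -2, so Q = (B - A) * D.
        assert a + b == 0 and q == (b - a) * d
    assert q in (2, -2)

checked_cases = len(list(product((1, -1), repeat=4)))
print("all", checked_cases, "cases agree with the footnote's argument")
```

If any of the checks failed, Python would stop with an AssertionError; the script running to completion means the case analysis holds for all 16 assignments.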

[5] Real experiments have imperfections, and Aspect and his co-workers had to use a careful analysis to take those imperfections into account. For example, the polarization photodetectors in the experiment would sometimes miss a photon, and this needs to be taken into account in analyzing the results. I won’t go into all those details here. More modern experiments are getting very close to the ideal experiment described in my essay.

[6] When people see the CHSH inequality and the results of the Aspect experiment for the first time, they sometimes say “oh, isn’t that just like the uncertainty principle, where particles don’t have a simultaneously well-defined position and momentum?” It is similar, but the contradiction of the CHSH inequality by experiment is a much stronger result. It’s true that the uncertainty principle does say that in quantum mechanics, a particle can’t have a simultaneously well-defined position and momentum. But this is just an assertion about the theory of quantum mechanics. The CHSH inequality and the Aspect experiment give us a direct experimental disproof of the idea that a particle has real intrinsic properties with their own independent existence.

[7] There are still a few people who believe that it’s possible to avoid the conclusion that the CHSH inequality and Aspect’s experiment force on us. There are two common lines of attack. The first is to argue that something Alice does can instantaneously influence what Bob sees, but in a way that doesn’t allow faster-than-light signalling. This is an interesting line of thought, but is in its own way also quite a radical departure from classical thought. The second is to argue that somehow the fact that the polarization photodetectors sometimes miss a photon is responsible for the failure of the CHSH inequality. Both these lines of attack continue to be developed, although neither is regarded as mainstream.