The framework

We formulate our result in the abstract state space framework (refs 4, 5, 6, 7). This framework arises from the idea of considering the largest possible class of physical theories (more precisely, generalized probabilistic theories) that satisfy minimal assumptions, containing classical and quantum theory as special cases. This allows us to study properties of quantum theory, like the non-discreteness of the state space, from an outside perspective. Here, we discuss these minimal assumptions very briefly and refer to Pfister (ref. 8) for a detailed introduction to the abstract state space framework and its mathematical background.

The framework, which relies on four minimal assumptions, is based on the idea that any physical theory admits the notions of states and measurements. Their interpretation is assumed to be given. The first assumption is that the normalized states form a convex subset $\Omega_A$ of a real vector space $A$. The underlying motivation is the idea of probabilistic state preparation: if $\omega, \tau \in \Omega_A$ are states that can each be prepared by a corresponding preparation procedure, then executing the preparation procedures with probabilities $p$ and $1-p$ should also lead to a state (described by the convex sum $p\omega + (1-p)\tau$), which should therefore be an element of $\Omega_A$ as well. The second assumption is that the dimension of the vector space containing the set of states is arbitrarily large but finite. This is a purely technical assumption intended to make the involved mathematics feasible. The third assumption is that the set of states is compact. Although there might be some physical motivation for this assumption, we shall be satisfied with considering it as a technical assumption.

Before we discuss the fourth assumption, we make a few comments on the structure of $\Omega_A$. The extreme points of $\Omega_A$ are the pure states of the system; the other elements are called mixed states. As $\Omega_A$ is a convex and compact subset of a finite-dimensional vector space $A$, every element of $\Omega_A$ is a convex combination of the extreme points of $\Omega_A$ (ref. 9). Thus, every state is a convex combination of pure states. As a convex combination is a sum with non-negative weights that sum up to one, a state can be seen as a probability distribution over pure states. In general, this probability distribution is not unique. In classical theory, however, it is (see the example below). In addition to the normalized states $\Omega_A$, an abstract state space $A$ also contains the subnormalized states $\Omega_A^{\leq 1}$, which are given by all rescalings of the normalized states by factors between zero and one.
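The non-uniqueness of convex decompositions can be checked by hand in a small example. The following sketch is illustrative only: it assumes a hypothetical two-dimensional state space whose pure states are the four corners of a square (the coordinates and the helper `mix` are our own choices for this illustration), and it exhibits two different probability distributions over the pure states that describe the same mixed state.

```python
from fractions import Fraction as F

def mix(weights, points):
    """Return the convex combination sum_i weights[i] * points[i]."""
    assert sum(weights) == 1 and all(w >= 0 for w in weights)
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points))
                 for k in range(dim))

# Pure states of a square-shaped state space (hypothetical coordinates).
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]

half = F(1, 2)
# Two different probability distributions over the pure states ...
via_first_diagonal = mix([half, 0, half, 0], square)
via_second_diagonal = mix([0, half, 0, half], square)

# ... describe the same mixed state, namely the centre of the square.
print(via_first_diagonal == via_second_diagonal)
```

In a simplex, by contrast, no such second decomposition exists, which is why the pure states of a classical system can be read as classical symbols.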

The fourth assumption states, roughly speaking, that every mathematically well-defined measurement is regarded as a valid measurement: a measurement is a finite set $\mathcal{M} = \{f_1, \ldots, f_n\}$ of functions that are called effects, each corresponding to an outcome of the measurement. For a state $\omega \in \Omega_A$, the value $f_i(\omega)$ is interpreted to be the probability that the measurement $\mathcal{M}$ yields the outcome $i$ when the system was in the state $\omega$ before the measurement. Thus, one must have $0 \leq f_i(\omega) \leq 1$ for all $\omega \in \Omega_A$. If the measured system was in the state $\omega$ with probability $p$ and in the state $\tau$ with probability $1-p$, then the probability $p f_i(\omega) + (1-p) f_i(\tau)$ of getting the outcome $i$ has to be identical to $f_i(p\omega + (1-p)\tau)$, as $p\omega + (1-p)\tau$ is regarded to be a state in its own right (in accordance with the first assumption). Skipping a few details, this means that effects are assumed to be linear. Moreover, the effects of a measurement have to sum up to the so-called unit effect $u_A$, for which $u_A(\omega) = 1$ for all $\omega \in \Omega_A$ (as the probability that any outcome occurs has to be one). The fourth assumption is that every set of such linear functionals (effects) is a valid measurement. We denote the set of all effects on an abstract state space $A$ by $E_A$, and we denote measurements (that is, sets of effects that sum up to the unit effect) by calligraphic letters ($\mathcal{M}$ or $\mathcal{N}$ in this paper).

We would like to emphasize that the fourth assumption, which connects the geometry of the states with the geometry of the effects (ref. 8), is standard but non-trivial and of crucial technical importance for our result. A compelling physical motivation does not seem to be obvious, so it should be regarded as a tentative assumption on the way to a better understanding of quantum theory. Note that as a consequence of this assumption, a theory where the set of states is a quantum state space but where the measurements are restricted to a proper subset of the positive operator valued measures (POVMs) is not part of the framework (cf. quantum theory in the examples below). In quantum information science, it is always assumed that the full set of POVMs can be performed.

These four assumptions determine the framework of abstract state spaces. This structure is sufficient as long as one is only interested in measurement statistics of one-shot measurements. If one wants to describe several consecutive measurements, one has to introduce measurement transformations. We will discuss this below, but first, we give a few examples.

In the following, we introduce a few examples of theories that can be formulated in the abstract state space framework. (More examples can be found in ref. 8.) While quantum and classical theories are theories of actual physical significance, the other theories that we introduce have the role of toy theories, which are helpful for understanding the framework. In particular, the square and the pentagon, which are instances of polygon models (see below), will serve as useful examples in the illustration of the proof idea of our result.

As a first example, let us have a look at quantum theory. The set of states of a (finite-dimensional) quantum system is given by $\Omega_A = \mathcal{S}(\mathcal{H})$ for some (finite-dimensional) Hilbert space $\mathcal{H}$, where $\mathcal{S}(\mathcal{H})$ denotes the positive operators on $\mathcal{H}$ with unit trace (the density operators). These operators form a compact convex subset of $A = \mathrm{Herm}(\mathcal{H})$, the vector space of Hermitian operators on $\mathcal{H}$. Every quantum system has continuously many pure states. The most general description of measurement statistics in quantum theory is given by a POVM, which is a set $\{F_1, \ldots, F_n\}$ of positive operators that sum up to the identity operator $I$ on $\mathcal{H}$. They give rise to the effects $f_i(\rho) = \operatorname{tr}(F_i \rho)$ that sum up to the unit effect $u_A$ given by $u_A(\rho) = \operatorname{tr}(\rho)$ for all $\rho \in \mathrm{Herm}(\mathcal{H})$. In analogy to our comment above, we emphasize that a theory where the states form a proper subset of a quantum state space but where the measurements are given by not more than POVMs fails to satisfy the fourth assumption of the framework because a reduction of the allowed states requires an extension of the effects.

Another example is classical theory. The states of a (finite) classical theory are given by a simplex, that is, by the convex hull of finitely many affinely independent points. (We say that points $v_1, \ldots, v_n$ in a real vector space are affinely independent if no point is an affine combination of the other points, that is, if for every $i$, there are no real coefficients $\lambda_j$, $j \neq i$, with $\sum_{j \neq i} \lambda_j = 1$ such that $v_i = \sum_{j \neq i} \lambda_j v_j$.) Examples of simplices are given by a line segment, a triangle, a tetrahedron, a pentachoron and so on. Every element of a simplex $\Omega_A$ is a unique convex combination of the extreme points of $\Omega_A$ (Fig. 2). Thus, for a simplex $\Omega_A$, the states are in a one-to-one correspondence with the probability distributions over the pure states, which in the case of a simplex are perfectly distinguishable. This allows us to interpret the pure states as classical symbols. In a classical system, there is a generic measurement. For a given state $\omega$, the outcome probabilities for this measurement are precisely the coefficients in the convex sum of the pure states that yield $\omega$.

Figure 2: (Non-)Uniqueness of convex decompositions. A classical system is described by a simplex, which has the property that every point is a unique convex combination of the extreme points. Thus, a state in a classical system corresponds to a unique probability distribution over classical symbols.

A more general class of examples is given by what we call discrete theories. We say that $\Omega_A$ is a discrete state space if it is the convex hull of finitely many (not necessarily affinely independent) points. As $\Omega_A$ is compact, this is equivalent to saying that the theory has only finitely many pure states. Classical theory is an example of a discrete theory, while quantum theory is not.

Very illustrative examples are given by the polygon models (ref. 10): these are abstract state spaces where $\Omega_A$ is a regular polygon, so they are special kinds of discrete theories. As the whole situation can be drawn in only three dimensions, the polygon models provide examples for which we can give a picture (Fig. 3). To see the interplay of states and effects in such a low-dimensional example, it is useful to represent effects as vectors in the same space as the states (ref. 10). To evaluate an effect at some state, one simply takes the scalar product of the state and the vector representing the effect. In the Methods section below, the square and the pentagon will be the central examples in the illustration of the proof idea.

Figure 3: The square polygon model. The upper part of the figure shows the set of normalized states $\Omega_A$ (grey), together with the subnormalized states $\Omega_A^{\leq 1}$ (white ‘pyramid’), which are given by all rescalings of normalized states with factors between zero and one. In the lower part of the figure, the subnormalized states are omitted. Instead, the effects $E_A$ are shown (here they correspond to an octahedron). The reader who is familiar with the mathematics of ordered vector spaces may notice that the effects arise from the structure of the dual cone (more precisely, the effects form an order interval $[0, u_A]$ in $A^*$) (ref. 8). Here, they are represented as vectors in the same space as the states. To calculate a probability $f(\omega)$, one simply takes the scalar product of the vector $\omega$ and the vector representing $f$.
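The evaluation of effects by scalar products can be made concrete with a small sketch. The coordinates below are an assumption made for illustration (normalized states placed in the plane $z = 1$, pure states at the corners $(\pm 1, \pm 1, 1)$); they are not necessarily the normalization used in ref. 10. With this choice, the unit effect is the vector $(0, 0, 1)$, and the two hypothetical effects `f1` and `f2` form a measurement.

```python
from fractions import Fraction as F

def evaluate(effect, state):
    """Probability f(omega), computed as the scalar product of two vectors."""
    return sum(e * s for e, s in zip(effect, state))

# Assumed embedding: normalized states lie in the plane z = 1.
pure_states = [(1, 1, 1), (-1, 1, 1), (-1, -1, 1), (1, -1, 1)]
unit_effect = (0, 0, 1)

# Two effects that form a measurement: they sum to the unit effect.
f1 = (F(1, 2), 0, F(1, 2))
f2 = (F(-1, 2), 0, F(1, 2))
assert tuple(a + b for a, b in zip(f1, f2)) == unit_effect

# Every effect must assign a probability in [0, 1] to every pure state.
for f in (f1, f2, unit_effect):
    assert all(0 <= evaluate(f, w) <= 1 for w in pure_states)

# A mixed state inside the square (hypothetical choice).
omega = (F(1, 2), 0, 1)
probabilities = [evaluate(f, omega) for f in (f1, f2)]
print(probabilities)  # outcome probabilities of the measurement {f1, f2}
```

Because the effects are linear, checking the bounds on the pure states is enough: every other state is a convex combination of them.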

So far, we have discussed the core structure of abstract state spaces: states and effects. They only allow for the description of one-shot measurement statistics. If one wants to describe the statistics of several consecutive measurements, then one has to specify what happens to the state of the system when a measurement is performed (otherwise, the statistics of the subsequent measurement cannot be described). In other words, one has to specify a rule for post-measurement states. The structure of an abstract state space, however, does not provide such a rule and leaves open the question of how to specify post-measurement states.

We deal with this question and consider some extra structure on abstract state spaces that provides a rule for post-measurement states. We describe the transition from the initial state of the system (before the measurement) to the post-measurement state by what we call a measurement transformation. Such transformations have been considered, for example, in refs 11, 12, 13. We go one step further. Our result makes a statement about the existence of measurement transformations in abstract state spaces that satisfy a certain postulate.

As mentioned above, the general idea is that a measurement transformation specifies a rule for how post-measurement states are assigned. However, in a physical theory, what such a rule looks like depends on the particular situation that one wants to describe. To be more specific, we can think of at least three such situations (we give quantum examples below), which correspond to the case where (a) the observer finds out the outcome of the measurement and describes the state of the system after the measurement conditioned on that outcome; (b) the observer describes the system after the measurement by a subnormalized state for the hypothetical case that a particular outcome occurred, incorporating the probability of that outcome into the post-measurement state; and (c) the observer does not find out the outcome of the measurement and describes the state of the system after the measurement, knowing only that the measurement has been performed. A physical theory has to allow for a mathematical description of all of these cases. Each of the three situations can be described by a particular kind of map. To understand the difference between them, it is helpful to see what these maps look like for the particular case of quantum theory. There, if the measurement is a projective measurement $\{P_1, \ldots, P_n\}$, the maps are given by Lüders projections (ref. 14) (the literature is ambiguous about which of the three maps is called a Lüders projection, but as they are very closely related, this usually does not lead to problems). The situations (a), (b) and (c) above are described by the following maps: in situation (a), if the outcome associated with projector $P_k$ is measured, then the state $\rho$ is transformed as

$$\rho \mapsto \frac{P_k \rho P_k}{\operatorname{tr}(P_k \rho)}.$$

In situation (b), considering the outcome associated with projector $P_k$, the state $\rho$ transforms into a subnormalized state as

$$\rho \mapsto P_k \rho P_k.$$

In situation (c), if the outcome of the measurement is unknown, the state $\rho$ is transformed as

$$\rho \mapsto \sum_k P_k \rho P_k.$$

Most introductory textbooks on quantum theory only discuss situation (a). Note that (a) is not a linear map. By the definition that we give below, it should not be called a transformation. The maps (b) and (c) are linear. The map (b) describes what Lüders calls a ‘measurement followed by selection’, whereas the map (c) describes what he calls a ‘measurement followed by aggregation’ (ref. 14).

The preceding discussion allows us to understand what we mean by a measurement transformation. By a measurement transformation, we mean a map of type (b). Note that such a map leads to subnormalized post-measurement states rather than normalized ones. The norm of the post-measurement state (the trace norm in the quantum case, $\lVert \sigma \rVert_1 = \operatorname{tr}\lvert \sigma \rvert$) is equal to the probability that the outcome occurs (which is what we mean by ‘the probability of that outcome is incorporated into the state’).

Choosing maps of type (b) (rather than maps of type (a) or (c)) as the subject matter is not a relevant restriction as the three types of maps are so closely related that insights into one of these maps translate into insights into the other maps as well. In particular, from the map of type (b), one can construct the map of type (a) by rescaling the images with the inverse probability and the map of type (c) by summing up over all outcomes.
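The relation between the three maps can be sketched for a qubit. This is a minimal illustration, assuming the standard Lüders form $\rho \mapsto P_k \rho P_k$ for the map of type (b); the function names and the example state are our own hypothetical choices. The maps of type (a) and (c) are built from (b) exactly as described above: by rescaling with the inverse probability and by summing over all outcomes.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def selective(p, rho):
    """Map (b): rho -> P rho P, a subnormalized state."""
    return matmul(matmul(p, rho), p)

def conditional(p, rho):
    """Map (a): rescale the image of (b) by the inverse outcome probability."""
    image = selective(p, rho)
    probability = trace(image)  # assumed to be non-zero
    return [[x / probability for x in row] for row in image]

def nonselective(projectors, rho):
    """Map (c): sum the images of (b) over all outcomes."""
    n = len(rho)
    images = [selective(p, rho) for p in projectors]
    return [[sum(img[i][j] for img in images) for j in range(n)]
            for i in range(n)]

# Hypothetical qubit example with real entries.
P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
rho = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]

print(selective(P0, rho))           # its trace is the probability of outcome 0
print(conditional(P0, rho))         # normalized post-measurement state
print(nonselective([P0, P1], rho))  # the off-diagonal entries are removed
```

The division by the outcome probability in `conditional` is what makes the map of type (a) non-linear, whereas `selective` and `nonselective` only use matrix products and sums.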

With the above motivation in mind, we now proceed to the task of formally defining what we mean by a measurement transformation on an abstract state space. A transformation $T$ on an abstract state space $A$ is a linear map $T: A \to A$ such that $T(\Omega_A^{\leq 1}) \subseteq \Omega_A^{\leq 1}$. The motivation for the linearity of transformations is similar to the motivation for the linearity of effects. The linearity expresses a compatibility condition for probabilistically prepared states: if the system is in a state $\omega$ with probability $p$ and in a state $\tau$ with probability $1-p$ before the transformation, then the transformed state $pT(\omega) + (1-p)T(\tau)$ has to coincide with $T(p\omega + (1-p)\tau)$, as $p\omega + (1-p)\tau$ is regarded as a state in its own right. (A more rigorous argument would require $f(pT(\omega) + (1-p)T(\tau)) = f(T(p\omega + (1-p)\tau))$ for all effects $f$, which eventually boils down to what we have just required.) A measurement transformation has to satisfy one more condition. As we have explained above, a measurement transformation is associated with a particular outcome, or more precisely, with a particular effect. If $T$ is a measurement transformation for an effect $f$, then we require that the norm of the transformed state is equal to the probability for measuring the outcome associated with $f$. In short, we require

$$u_A(T(\omega)) = f(\omega) \quad \text{for all } \omega \in \Omega_A.$$

In quantum theory, where $u_A$ is given by the trace, this property is satisfied for projective measurements, as the Lüders projection gives $\operatorname{tr}(P_k \rho P_k) = \operatorname{tr}(P_k \rho)$.
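The quantum case can be verified in one line. The following worked equation assumes only that $P_k$ is a projector ($P_k^2 = P_k$) and that the trace is cyclic; the symbols $T_k$ for the map $\rho \mapsto P_k \rho P_k$ and $f_k$ for the effect $\rho \mapsto \operatorname{tr}(P_k \rho)$ are notation introduced here for illustration.

```latex
u_A\bigl(T_k(\rho)\bigr)
  = \operatorname{tr}(P_k \rho P_k)
  = \operatorname{tr}(P_k^2 \rho)
  = \operatorname{tr}(P_k \rho)
  = f_k(\rho).
```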

We will only consider measurement transformations for a special class of effects that we call pure effects. We say that an effect is pure if it is an extreme point of the (convex) set of effects $E_A$, and we say that a measurement $\mathcal{M} = \{f_1, \ldots, f_n\}$ is pure if every effect $f_i$ is pure. It turns out that in the case of quantum theory, the effect of a POVM element $F$ is pure if and only if $F$ is a projector (ref. 8). Thus, we only consider measurement transformations for a class of effects that, in the case of quantum theory, reduces to projectors. For this class, the measurement transformations are given by Lüders projections. The fact that we restrict our considerations to pure effects is not a restriction of the validity of our result. Quite the contrary, this makes our result stronger. As we will see below, our postulate claims a property of measurement transformations for pure effects rather than claiming this property for all effects. This results in a weaker postulate, so every implication derived from this postulate leads to a stronger result. Later, we will restrict the claim of the postulate to an even smaller subclass of effects (see the Methods section and the Supplementary Note 1 for further details).

In a nutshell, a measurement transformation for a pure effect $f$ is a linear map $T: A \to A$ with $T(\Omega_A^{\leq 1}) \subseteq \Omega_A^{\leq 1}$ and $u_A \circ T = f$.

The postulate

Before we can formulate our result, we first state our postulate. For a mathematically precise formulation, we refer to the Methods section and the Supplementary Note 1 of this article.

The postulate reads: Every pure measurement can be performed in a way such that the states for which it yields a certain outcome (that is, the states with an outcome of probability one) are left invariant. In more illustrative terms, this can be rephrased by saying that no information gain implies no disturbance.

In more technical terms, the postulate states that for every pure effect $f \in E_A$, there exists an associated measurement transformation $T$ with $u_A \circ T = f$ such that for all states $\omega \in \Omega_A$ with $f(\omega) = 1$, we have that $T(\omega) = \omega$. The existence of such a measurement transformation $T$ is what is meant by saying that there exists a way to perform the measurement. Furthermore, note that without looking at the definition of a measurement transformation, saying that ‘there exists a way to perform the measurement’ may appear trivial by itself. After all, doing nothing and outputting the measurement outcome (associated with) $f$ preserves $\omega$ and yields $f$ with probability 1. This case is ruled out by the definition of a measurement transformation. More precisely, note that $T$ must be such that $u_A(T(\omega)) = f(\omega)$ for all states $\omega \in \Omega_A$. That is, it yields the correct probabilities for any state that we wish to measure. It is interesting to note that the actual proof of our main result only needs an even weaker but rather technical requirement (see the Methods section).
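Both points of this paragraph can be checked in a qubit sketch. We assume the Lüders form $T(\rho) = P\rho P$ as the candidate measurement transformation for the effect $f(\rho) = \operatorname{tr}(P\rho)$; the helper names and example states below are hypothetical. The sketch confirms that a state with a certain outcome is left invariant, and that ‘doing nothing’ does not reproduce the outcome probabilities for a generic state.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def luders(p, rho):
    """Candidate measurement transformation T(rho) = P rho P."""
    return matmul(matmul(p, rho), p)

def do_nothing(p, rho):
    """Candidate that ignores the effect and returns the state unchanged."""
    return rho

def effect(p, rho):
    """f(rho) = tr(P rho), the probability of the outcome."""
    return trace(matmul(p, rho))

P = [[1, 0], [0, 0]]
certain = [[1, 0], [0, 0]]                          # f(certain) = 1
generic = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]  # f(generic) = 3/4

# Conclusion of the postulate: a state with a certain outcome is invariant.
assert effect(P, certain) == 1 and luders(P, certain) == certain

# Correct statistics, u_A(T(rho)) = f(rho), hold for the Lüders map ...
assert trace(luders(P, generic)) == effect(P, generic)
# ... but fail for "doing nothing", which is therefore ruled out.
assert trace(do_nothing(P, generic)) != effect(P, generic)
```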

To see the link to information gain, note that the Shannon information content (see, for example, ref. 15) is zero for any outcome of an experiment that occurs with certainty. As such, $f(\omega) = 1$ is equivalent to stating that no information gain occurs. The demand that $T(\omega) = \omega$ says that the state is unchanged, that is, no disturbance has occurred.
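For concreteness, the standard definition of the Shannon information content of an outcome with probability $f(\omega)$ can be written out; the symbol $h$ is notation assumed here for illustration.

```latex
h = \log_2 \frac{1}{f(\omega)},
\qquad
f(\omega) = 1 \;\Longrightarrow\; h = \log_2 1 = 0 .
```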

Quantum theory and classical theory satisfy this postulate. In quantum theory, for example, if a system is in a state $\rho$ such that a projective measurement $\{P_1, \ldots, P_n\}$ has some outcome $k$ with probability $\operatorname{tr}(P_k \rho) = 1$, then the transformation $\rho \mapsto P_k \rho P_k$ leaves the state invariant. Quantum theory even satisfies the postulate in a much stronger form, in the sense that little information gain also causes only little disturbance. This can be seen from a special case of the gentle measurement lemma (refs 16, 17). It states that if measuring an outcome associated with a projector $F$ has probability $1 - \varepsilon$, then measuring that outcome disturbs the state by no more than an amount of order $\sqrt{\varepsilon}$. Setting $\varepsilon = 0$, this reduces to our postulate. However, we emphasize that our postulate is much weaker than postulating the gentle measurement lemma. We also note that our postulate does not make any assumptions about locality, that is, it does not make a statement about whether verification measurements of bipartite states can be implemented on local quantum systems or locally disturb the state, as has been considered in Popescu and Vaidman (ref. 18).

Even though the statement of the postulate is very concise, it may appear unsatisfying as it involves the abstract concept of a state, which is something that one cannot observe directly. However, it can be reformulated in purely operational terms, referring only to directly observable objects, namely measurement statistics. Such a reformulation is possible because two states can be regarded as being identical if and only if they induce the same measurement statistics for every measurement (in more mathematical terms, a state is an equivalence class under the relation $\omega \sim \tau$ if and only if $f(\omega) = f(\tau)$ for all $f \in E_A$) (ref. 19). Hence, instead of making statements about states, one can make statements about the statistics of all potential measurements. Figure 4 illustrates the idea of this reformulation.

Figure 4: A reformulation of the postulate in purely operational terms. Instead of referring to initial and post-measurement states, the reformulated version states that a measurement with a definite outcome does not influence the statistics of any subsequent measurement, so it only refers to directly observable quantities. This reformulation can be understood as follows: consider a preparation that outputs an initial state $\omega$ and a measurement $\mathcal{M} = \{f_1, \ldots, f_n\}$ such that $f_k(\omega) = 1$ for some $k$. According to the postulate, the states of the system after the two experiments shown in (a) are identical. Thus, if the two experiments are followed by any measurement, say $\mathcal{N} = \{g_1, \ldots, g_l\}$, then the statistics of the $\mathcal{N}$-measurement coincide (see part (b) of the figure). The $\mathcal{N}$-statistics coincide for every measurement $\mathcal{N}$. This is equivalent to saying that the states before the $\mathcal{N}$-measurement (that is, $\omega$ and $T(\omega)$) are identical. Thus, we do not need to refer to states and can reformulate the postulate as: if a measurement has a definite outcome, then performing this measurement does not influence the statistics of any subsequent measurement. This is shown diagrammatically in part (c) of the figure.

Main findings

In terms of the postulate, our result can now be stated as follows: an abstract state space that satisfies the postulate is either non-discrete (that is, it has infinitely many pure states) or it is classical.

This means that if a physical system is described by an abstract state space where the set of states is a polytope that is not a simplex (that is, if it is a discrete non-classical system), then it violates our postulate.
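Whether a polytope is a simplex can be decided from its pure states alone: the extreme points must be affinely independent, which means that their number equals the affine dimension plus one. The following sketch implements this test under the assumption that the input list contains exactly the extreme points of the state space; the function names and example coordinates are hypothetical.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def is_simplex(pure_states):
    """True if the given extreme points are affinely independent."""
    first, *rest = pure_states
    differences = [[x - y for x, y in zip(p, first)] for p in rest]
    return len(pure_states) == matrix_rank(differences) + 1

triangle = [(0, 0), (1, 0), (0, 1)]            # classical: three symbols
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]  # discrete, non-classical

print(is_simplex(triangle))
print(is_simplex(square))
```

In the language of the result, a discrete state space for which `is_simplex` returns `False` is a polytope that is not a simplex, and hence cannot satisfy the postulate.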

Furthermore, our result is robust in the sense that discrete non-classical theories are ruled out even if the postulate is weakened to an approximate version. To formulate this approximate version of the result, we assume that $A$ is equipped with a norm $\lVert \cdot \rVert$. This induces a distance function on $A$. We prove that for every discrete non-classical theory, equipped with some norm $\lVert \cdot \rVert$, there is a positive number $\varepsilon$ such that the implication $f(\omega) = 1 \Rightarrow \lVert T(\omega) - \omega \rVert \leq \varepsilon$ (where $T$ is the measurement transformation for $f$) cannot be satisfied for every pure effect $f \in E_A$. We prove this approximate case, which is a stronger version of the result, in the Supplementary Note 2.