Classical physics and quantum mechanics are the only two frameworks for physics that are worth mentioning. And it's quantum mechanics that is more true in Nature, that is more fundamental, and that is more legitimate as the starting point. Classical physics may be derived as a limit of quantum mechanics but quantum mechanics can't be obtained by any similarly straightforward, guaranteed-to-succeed procedure from classical physics.



And yet, quantum mechanics remains wildly misunderstood and underestimated. Many people, including professional physicists, can't resist their primitive animal instincts and they keep on trying to rape quantum mechanics, insert their prickly objections and modifications into it, and make it more classical. However, quantum mechanics is well protected and it can't get pregnant with bastards. It's just patiently saying "f*** off" to these deluded non-physicists and equally deluded physicists.



Even those who realize that quantum mechanics – the framework respected by Nature – is fundamentally different from classical physics, and that there won't be any counterrevolution that would make physics classical once again, often underestimate the rigidity and uniqueness of the universal postulates of quantum mechanics. They think that many things could be altered or mutated, that quantum mechanics has many possible cousins, and that it's an accident that Nature chose this particular quantum mechanics and not one of the cousins.



They're wrong, too. In this text, I will demonstrate why certain properties of quantum mechanics are inevitable for a consistent theory.









Complex numbers are the only allowed number system for amplitudes



First, let us imagine a cousin of quantum mechanics where wave functions \(\ket\psi\) take values in a Hilbert space that isn't complex: let's try to replace \(\CC\) by \(\RR\), \(\HHH\), or something else.









If we're not allowed to multiply the state vector by the imaginary unit \(i\), e.g. if we try to work with the real numbers \(\RR\), we're immediately in trouble. Schrödinger's equation says that\[



i\hbar\frac{\dd}{\dd t}\ket\psi = \hat H \ket\psi



\] and the coefficient is pure imaginary. This pure imaginary character of the coefficient is what is needed to preserve \(\braket\psi\psi\), the norm that is interpreted as the total probability. For energy eigenstates, the equation says that only the phase is changing with time. With a real coefficient, the wave function would exponentially increase or decrease with time – and so would the total probability of all mutually exclusive properties of the physical system.
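
To make the contrast concrete, here is a minimal sketch for a single energy eigenstate. The function names, the choice \(\hbar=1\), and the sample values of the energy and time are my own illustrative assumptions; the "real coefficient" equation is the hypothetical mutation discussed above, not anything realized in Nature.

```python
import cmath
import math


def evolve_imaginary(c0, energy, t, hbar=1.0):
    """Solve i*hbar dc/dt = E c for one eigenstate: only the phase rotates."""
    return cmath.exp(-1j * energy * t / hbar) * c0


def evolve_real(c0, energy, t, hbar=1.0):
    """Solve the hypothetical hbar dc/dt = E c: the amplitude grows exponentially."""
    return math.exp(energy * t / hbar) * c0


c0 = 0.6 + 0.8j  # illustrative normalized amplitude, |c0|^2 = 1
prob_imaginary = abs(evolve_imaginary(c0, 1.0, 2.0)) ** 2  # stays equal to 1
prob_real = abs(evolve_real(c0, 1.0, 2.0)) ** 2            # grows like exp(2 E t / hbar)
print(prob_imaginary, prob_real)
```

With the imaginary coefficient the squared norm stays put; with the real one it is multiplied by \(\exp(2Et/\hbar)\), so the "total probability" leaves 100 percent immediately.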



I will discuss the need for "unitarity of the evolution" momentarily.



We began with Schrödinger's equation as a place where the imaginary unit \(i\) appears but as you know, I don't consider this equation to be excessively "superfundamental" in quantum mechanics. One may show – and Dirac has shown – that this equation is equivalent to the Heisenberg picture in which the state vector is constant but the operators evolve according to the Heisenberg equations of motion\[



i\hbar\frac{\dd}{\dd t}\hat L = [\hat L, \hat H]



\] Needless to say, the coefficient \(i\hbar\) in this equation may be shown to be the same \(i\hbar\) we had in Schrödinger's equation. In this picture, we may offer many independent explanations why the coefficient has to be pure imaginary. For example, if the operator \(\hat L\) is required to be Hermitian at all times – as appropriate for observables, as we will discuss – its time derivative has to be Hermitian, too.



However, the commutator of two Hermitian operators is anti-Hermitian, i.e. it obeys\[



\eq{

[\hat L, \hat H]^\dagger &= (\hat L \hat H - \hat H\hat L)^\dagger =\\

&= \hat H\hat L - \hat L\hat H = [\hat H,\hat L] = -[\hat L, \hat H]

}



\] where I have used \[



\hat H^\dagger = \hat H, \quad \hat L^\dagger = \hat L, \quad (\hat X\hat Y)^\dagger = \hat Y^\dagger \hat X^\dagger.



\] If we want to express a Hermitian operator using this anti-Hermitian commutator – a candidate for the time derivative of a Hermitian operator has to be Hermitian – we have to multiply the commutator by a pure imaginary constant, namely \(1/(i\hbar)\) in the usual conventions, which erases "anti-" from the adjective.
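
The same algebra may be checked numerically on a small example. The sketch below uses two arbitrarily chosen Hermitian \(2\times 2\) matrices as stand-ins for \(\hat L\) and \(\hat H\); the matrices, the helper names, and \(\hbar=1\) are assumptions made only for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate: transpose plus complex conjugation."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def scale(z, a):
    """Multiply every matrix entry by the number z."""
    return [[z * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Entry-by-entry comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


L = [[1, 2 - 1j], [2 + 1j, -3]]   # illustrative Hermitian "observable"
H = [[0.5, 1j], [-1j, 2]]         # illustrative Hermitian "Hamiltonian"
hbar = 1.0

C = commutator(L, H)
dL_dt = scale(1 / (1j * hbar), C)  # Heisenberg equation: dL/dt = [L, H] / (i hbar)

print(close(dagger(C), scale(-1, C)))  # commutator is anti-Hermitian
print(close(dagger(dL_dt), dL_dt))     # the imaginary factor makes it Hermitian
```

Dividing the commutator by a real constant instead would leave it anti-Hermitian, so \(\hat L\) would stop being Hermitian after an infinitesimal time.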



We don't really need to discuss the Hamiltonian and time evolution at all. Think about Heisenberg's "uncertainty principle" commutator\[



[\hat x,\hat p ] = i\hbar.



\] A few paragraphs above, I proved that the commutator of two Hermitian operators is actually anti-Hermitian. So if the commutator of these two particular operators is a \(c\)-number, i.e. a multiple of the unit operator, then the \(c\)-number has to be pure imaginary. Again, it's called \(i\hbar\) using the usual symbols and unit conventions of quantum mechanics. And once you accept that the commutator is a pure imaginary i.e. non-real operator, it follows that there can't be a basis in which both \(\hat x\) and \(\hat p\) would be expressed by real matrices; the commutator of any two real matrices is real as well which is no good to satisfy the relationship above!



So the imaginary unit \(i\) is clearly needed. You may try to go from \(\CC\) in the direction opposite to \(\RR\), i.e. to larger number systems such as \(\HHH\) and \(\OO\). If you pick the quaternions \(\HHH\), it won't be lethal but the non-complex Hamilton numbers will be redundant. There are various ways to see it. For example, Schrödinger's equation or Heisenberg's equations will have one particular pure imaginary unit which we may still call \(i\) without a loss of generality. If we pick some "orthogonal" imaginary unit in the quaternions such as \(j\), the hypothetically quaternionic wave function will effectively split into two complex ones,\[



\ket{\psi_\HHH} = \ket{\psi_\CC}_1 + j \ket{\psi_\CC}_2



\] and these two state vectors labeled by the subscripts \(1,2\) will evolve independently from each other. The only physically meaningful interpretation of the wave function above will be equivalent to a density matrix that is obtained by mixing the two pure density matrices:\[



\rho_{\HHH,\rm equiv} = \ket{\psi_\CC}_1 \bra{\psi_\CC}_1 + \ket{\psi_\CC}_2 \bra{\psi_\CC}_2.



\] You don't get anything fundamentally new. The "quaternionic wave function" will be intrinsically "reducible" and you may always study the elementary building blocks that the wave function may be reduced to – and they're complex. At least with a single time coordinate, you can't get anything really new that could be called "quaternionic quantum mechanics".



Only the complex numbers are tolerable as the "fair number system" for the coordinates of the state vector. Real numbers are complex numbers that are constrained by an extra condition – one that is lethal for a physical interpretation, as we have pointed out; quaternions can't really show their muscles beyond their being a "pair of complex numbers".



This fundamental character of complex numbers holds even in "deep enough mathematics" that is detached from the physical conditions we have discussed. For example, if we talk about representations of groups – and the Hilbert spaces in any quantum mechanical theory are representations of groups and algebras of operators – the "default" character of a representation is always complex, i.e. \(\CC^n\). The real representations \(\RR^n\) and the pseudoreal representations, which include the quaternionic ones \(\HHH^{n/2}\), may be interpreted as ordinary complex representations \(\CC^n\) with an extra "structure map" \(j\) acting on the representation that is antilinear (differing from a linear map by an extra complex conjugation in a defining "scalar linearity" condition) and that commutes with the action of the group.



Real representations are those whose structure map obeys \(j^2=+1\) while the pseudoreal (including quaternionic) representations are those that obey \(j^2=-1\). At any rate, the representation may always be viewed as a "complex representation with some extra structure". For \(j^2=+1\), the structure map allows us to prove that there is a basis in which all the matrices are real; for \(j^2=-1\), we may prove that all the matrices representing the group elements may be organized into \(2\times 2\) blocks \(a+ b\sigma_y\) where \(a,b\in\CC\) and these blocks effectively represent \(1\times 1\) quaternionic entries \(a+jb\).



Real numbers and quaternions are just "cherries added on a fundamental pie" and the fundamental pie is always complex. It's not smaller and it's not larger. At the end, this fundamental position of complex numbers boils down to the fundamental theorem of algebra: every polynomial equation of \(n\)-th degree has \(n\) roots, counted with multiplicity. Among the number systems discussed here, this theorem only holds for \(\CC\).



While the quaternions as components of a state vector were just "redundant" but non-lethal, octonions \(\OO\) would be lethal as matrix entries of operators because octonions are not associative (they break the rule \((ab)c=a(bc)\)) while the matrices – something identified with observables and evolution operators etc. – have to be associative e.g. because the evolution is associative.
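
The failure of associativity is easy to exhibit explicitly. The sketch below builds the quaternions and octonions from the reals by the standard Cayley–Dickson doubling, \((a,b)(c,d) = (ac - \bar d b,\; da + b\bar c)\); the list-of-coefficients representation and the helper names are my own choices for illustration.

```python
def conj(x):
    """Cayley-Dickson conjugation: (a, b)* = (a*, -b)."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def add(x, y):
    return [p + q for p, q in zip(x, y)]


def sub(x, y):
    return [p - q for p, q in zip(x, y)]


def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d* b, d a + b c*)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))


def unit(dim, index):
    """Basis element e_index of the dim-dimensional algebra."""
    return [1 if n == index else 0 for n in range(dim)]


# Quaternions (dim 4): products of i, j, k are associative.
i, j, k = unit(4, 1), unit(4, 2), unit(4, 3)
quaternions_ok = mul(mul(i, j), k) == mul(i, mul(j, k))

# Octonions (dim 8): the units e1, e2, e4 are not.
e1, e2, e4 = unit(8, 1), unit(8, 2), unit(8, 4)
left = mul(mul(e1, e2), e4)
right = mul(e1, mul(e2, e4))
print(quaternions_ok, left, right)
```

For these three octonionic units, the two bracketings differ by a sign, so a "matrix" with such entries could not compose evolution over consecutive time intervals in an unambiguous way.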



You could try to modify \(\CC\) in a different way – for example, you could try to pick all the "rational complex numbers". This would also be bad, at least in theories with a continuous time coordinate. In some not-quite-physical toy models, the amplitudes could happen to be rational for "rational questions" but it's an extra coincidence, or an "extra structure", and it doesn't hurt if you simply use wave functions in \(\CC^n\).



Paradoxically enough, the most tolerable alternative "number systems" in which you could try to pick your state vector are deeply esoteric systems such as the so-called \(p\)-adic numbers. Quantum mechanics based on such numbers could obey some consistency rules but it would certainly be very different from the theories we use to describe Nature around us.



Linearity of evolution operators



Schrödinger's equation is linear in the wave function. This also implies that the finite-time evolution operators are linear:\[



\ket{\psi(t_1)} = U(t_1,t_0) \ket{\psi(t_0)}



\] Could we make the future wave function depend on the initial wave function in a nonlinear way? We could try but we would quickly run into some serious trouble. What kind of trouble?



Quantum mechanics and any other "at least remotely similar" hypothetical cousin of it describes the state "A or B", with some probabilities, as a superposition\[



\ket{\psi(t_0)} = c_A \ket A + c_B \ket B



\] Assume that someone may "perceive" whether the state of the physical system at time \(t_0\) is A or B; the "A or B" information is legitimate information that may split consistent histories. Without a loss of generality, imagine that she learns that the state is A. Such a state will evolve into \(c_A \cdot U(t_1,t_0)\ket A\) at time \(t_1\). Similarly for B.



Now, it's important that her consciousness or the absence thereof remains undetectable. After all, no one has ever experimentally demonstrated whether women have consciousness much like men. ;-) And it's true for men, too. It's important that someone's "conscious" learning about the result of a measurement doesn't modify the system in any further way. The procedure needed to measure may impact the measured physical system of interest; however, the mental processes that this measurement causes remain subjective and inconsequential for the rest of the world. We don't want a qualitative "wall" separating conscious and unconscious objects or subjects. Observers are dull physical systems, too.



We're really discussing "Wigner's friend" scenario here. It's important that Wigner is allowed to ignore the "A or B" realization and continue to work with the whole initial state \(\ket{\psi(t_0)}\) above. Because the evolution operator is linear, this state evolves to\[



\ket{\psi(t_1)} = c_A U(t_1,t_0) \ket A + c_B U(t_1,t_0) \ket B.



\] That's great because these two terms (and it could work for many terms, too) are sharply separated from one another. Wigner may calculate the probability of a property at time \(t_1\) and there's a chance that the "A or B perception" at time \(t_0\) only has a tolerable impact on Wigner's predictions: it suppresses the history with A and B at \(t_0\) by their probabilities \(p(A),p(B)\), respectively.



If the evolution operator were nonlinear, Wigner would get various terms that depend both on \(c_A\) and \(c_B\), e.g. that would be proportional to \(c_A^m c_B^n\) with some positive powers. These terms would be there and nonzero if he used the full wave function with both possibilities; but if he accepted that his female friend made a measurement at time \(t_0\), they would disappear because \(c_A^m c_B^n=0\) if either \(c_A=0\) or \(c_B=0\)! So he would get different predictions depending on the question whether his female friend "perceived" something or not.
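
A toy two-state model shows the cross terms at work. In the sketch below, the linear evolution is a fixed unitary matrix, while the "nonlinear" evolution adds a term proportional to \(c_A c_B\); this particular nonlinearity and the coupling `g` are purely hypothetical choices made to illustrate the argument, not a proposal taken from any actual theory.

```python
import math

S = 1 / math.sqrt(2)


def evolve_linear(psi):
    """Unitary toy evolution: U = (1/sqrt 2) [[1, 1], [1, -1]]."""
    a, b = psi
    return [S * (a + b), S * (a - b)]


def evolve_nonlinear(psi, g=0.3):
    """Hypothetical mutation: the linear result plus a cross term g * c_A * c_B."""
    a, b = psi
    out = evolve_linear(psi)
    out[0] += g * a * b
    return out


def branch_mismatch(evolve, c_a, c_b):
    """Largest difference between evolving the full superposition (Wigner)
    and adding up the separately evolved A and B branches (his friend)."""
    full = evolve([c_a, c_b])
    from_a = evolve([c_a, 0.0])
    from_b = evolve([0.0, c_b])
    return max(abs(f - (x + y)) for f, x, y in zip(full, from_a, from_b))


mismatch_linear = branch_mismatch(evolve_linear, S, S)
mismatch_nonlinear = branch_mismatch(evolve_nonlinear, S, S)
print(mismatch_linear, mismatch_nonlinear)
```

For the linear map the two bookkeepings agree up to rounding errors; for the nonlinear map they differ by the cross term \(g\,c_A c_B\), which vanishes as soon as one of the coefficients is set to zero by the friend's "perception".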



In other words, souls and ghosts would become physical and they would start to fly everywhere. This is lethal for a candidate theory of mutated quantum mechanics not only because we dislike souls and ghosts. It's lethal because the measurement – that would tangibly affect Wigner's predicted probabilities – could occur at huge distances, at a spacelike separation, and the influence proved above would be a genuine, detectable, faster-than-light signal that would demonstrably violate Einstein's special theory of relativity. We would enable not only souls and ghosts; we would enable superluminal voodoos. You should understand that this would lead to real trouble in the predicted phenomena which is a genuine, objective problem with a candidate theory; your unfamiliarity with a mathematical framework to describe Nature (quantum mechanics) is not a genuine problem, it is just your subjective, psychological problem.



So we have to keep the alternatives that may decohere from each other separated, even after some extra evolution in time; linearity is needed for that. Because the evolution operator \(U(t_1,t_0)\) is a linear operator on the Hilbert space, so is its \(t_1\) derivative near \(t_1\to t_0\) – and it's the Hamiltonian that enters Schrödinger's or Heisenberg's equations (up to a factor of \(i\hbar\)). So the Hamiltonian has to be a linear operator, too.



Similarly, we may see that all other observables representing Yes/No questions have to be linear operators. These linear Hermitian projection operators \(P\) are operators of the type that Wigner's female friend actually applied at time \(t_0\) to simplify her further thinking about the system (the "collapse" of the wave function). If the operator were not linear, one would get a similar interference between the possibilities that should be mutually exclusive.



The Yes/No operators have to be projection operators, \(P^2=P\) – yes, I started to drop the silly hats at some moment, I hope that you survived that (everything in Nature has hats and we should, on the contrary, invent bizarre accents for things that aren't operators, to emphasize that they're not fundamental physical quantities!) – because we want their eigenvalues to be \(0\) and \(1\). Also, we need \(P^\dagger=P\) because we want all the eigenvectors with the \(0\) eigenvalue to be orthogonal to (i.e. mutually exclusive with) those with the \(1\) eigenvalue.
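
Both conditions may be checked on \(2\times 2\) examples. The sketch below compares a Hermitian projector with an idempotent matrix that is not Hermitian; both matrices and the helper names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Act with a 2x2 matrix on a 2-component vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]


def inner(u, v):
    """Inner product <u|v> with complex conjugation on the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


P_herm = [[0.5, 0.5], [0.5, 0.5]]  # P^2 = P and P^dagger = P
P_skew = [[1, 1], [0, 0]]          # P^2 = P but P^dagger != P

yes_herm, no_herm = [1, 1], [1, -1]  # eigenvectors of P_herm with eigenvalues 1 and 0
yes_skew, no_skew = [1, 0], [1, -1]  # eigenvectors of P_skew with eigenvalues 1 and 0

overlap_herm = inner(yes_herm, no_herm)  # vanishes: Yes and No exclude each other
overlap_skew = inner(yes_skew, no_skew)  # nonzero: "Yes" overlaps with "No"
print(overlap_herm, overlap_skew)
```

Idempotency alone fixes the eigenvalues to \(0\) and \(1\), but only the Hermitian projector keeps the "Yes" and "No" eigenvectors orthogonal.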



Yes/No operators must be represented by linear Hermitian projection operators. Similarly, operators such as \(X\) are linear Hermitian operators because they may be constructed out of the Yes/No operators by the following sums:\[



X = \sum_i X_i P_{X=X_i}^{0/1:\rm No/Yes}.



\] Note that this formula doesn't really depend on any conventions in quantum mechanics. It just says that the value of \(X\) is the one allowed (eigen)value \(X_i\) for which the projector \(P_{X=X_i}\) equals \(1\); the other projection operators are effectively equal to zero.
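
As a small illustration of this sum, the sketch below assembles an observable from two orthogonal Yes/No projectors. The eigenvalues \(2\) and \(-1\), the projectors, and the function name `spectral_sum` are assumptions chosen only for the example.

```python
def spectral_sum(values, projectors):
    """Build X = sum_i X_i * P_i from eigenvalues and Yes/No projectors."""
    n = len(projectors[0])
    return [[sum(x * p[i][j] for x, p in zip(values, projectors))
             for j in range(n)] for i in range(n)]


def apply(m, v):
    """Act with a square matrix on a vector."""
    n = len(v)
    return [sum(m[i][k] * v[k] for k in range(n)) for i in range(n)]


P_plus = [[0.5, 0.5], [0.5, 0.5]]     # projector on the direction (1, 1)
P_minus = [[0.5, -0.5], [-0.5, 0.5]]  # projector on the direction (1, -1)

X = spectral_sum([2.0, -1.0], [P_plus, P_minus])
print(X)
print(apply(X, [1, 1]), apply(X, [1, -1]))
```

Acting on a vector for which \(P_+=1\), the operator \(X\) just multiplies it by \(2\); acting on a vector for which \(P_-=1\), it multiplies it by \(-1\).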



Fine. We see that all observables with real measurable values are represented by linear Hermitian operators acting on a complex Hilbert space.



Probabilities as squared amplitudes



Born's rule tells you that the probabilities – the only kind of numbers that quantum mechanics may predict in the most general situations – are calculated from the complex numbers, the amplitudes, by squaring their absolute values. We have\[



p_i = |c_i|^2, \quad c_i\in \CC.



\] That's obviously another favorite target of the rapists I mentioned at the beginning. Why wouldn't we use \(|c_i|\) or, more naturally, \(|c_i|^4\) or any other function of the amplitudes (perhaps not necessarily a phase-independent function)? If you pick the fourth power, for example, you may surely get an equally good cousin of quantum mechanics – or mutated quantum mechanics – and our Nature has just picked the second power due to some random subjective choices, hasn't it?



Not really.



When you decompose a wave function into some components that are eigenvectors of \(L\)\[



\ket\psi = \sum_i c_i \ket{\ell_i},\quad L\ket{\ell_i} = L_i \ket{\ell_i},



\] we want to say that the probability that \(L=L_i\) is equal to \(p_i=|c_i|^2\), assuming that the basis of vectors \(\ket{\ell_i}\) is orthonormal. We need it for the total probability of all possibilities, \(\sum_i p_i\), to be conserved. So if it is 100 percent at the beginning, it is 100 percent at the end.



This conservation law follows from the Hermiticity of \(H\), which we have already demonstrated; the evolution operators are unitary, \(UU^\dagger=U^\dagger U = {\bf 1}\), as a result. And what is conserved is \(\braket\psi\psi\) which may be proved to be equal to \(\sum_i |c_i|^2\) by pure algebra i.e. without any assumptions about physics. There can't be an equally general sum that is conserved in the general situation so the two sums must be functions of one another and \(p_i=|c_i|^2\) follows from that (up to the freedom to insert an illogical universal multiplicative coefficient into this relation).



This argument holds for any Hermitian operator \(L\) and the corresponding decomposition of the state vectors into its eigenvectors. The probabilities have to be given by the squared amplitudes, otherwise the "total probability of all mutually exclusive alternatives" can't be conserved.
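
One may watch the alternative powers fail in the simplest possible case. The sketch below applies a unitary rotation to a two-component state and evaluates \(\sum_i |c_i|^n\) for \(n=1,2,4\); the rotation angle and the function names are illustrative assumptions.

```python
import math


def rotate(c, theta):
    """Apply the unitary (real orthogonal) rotation by theta to a 2-component state."""
    a, b = c
    return [math.cos(theta) * a - math.sin(theta) * b,
            math.sin(theta) * a + math.cos(theta) * b]


def total(c, power):
    """Candidate 'total probability': the sum of |c_i|**power."""
    return sum(abs(x) ** power for x in c)


start = [1.0, 0.0]                 # every candidate rule gives a total of 1 here
end = rotate(start, math.pi / 4)   # equal mixture of both components

for power in (1, 2, 4):
    print(power, total(start, power), total(end, power))
```

Only the second power returns the same total before and after the rotation; the first power overshoots 100 percent and the fourth power undershoots it.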



You could try to keep on struggling and proposing various creative loopholes. For example, you could say that this whole quantum mechanics is based on "unitary evolution operators" and the unitary groups just happen to have a bilinear (well, sesquilinear) invariant given by the complexified Pythagorean theorem. But there may be other groups that have higher-order invariants, right?



Well, there exist groups with higher-order invariants but these invariants aren't guaranteed to be positive so they can't play the role of probabilities. This is enough to kill these possibilities but there are actually many other ways to kill them. We simply want simple enough state vectors – energy eigenstates – to evolve simply. A mere change of the phase with time is what this simple evolution has to look like.



There are various other ways to attack this loophole but I don't want to spend too much time on it. You should just realize that in proper quantum mechanics – whatever the Hamiltonian is: non-relativistic quantum mechanics, quantum field theory, string theory, whatever you like – pretty much any "physical transformation" of the physical system (evolution in time, translation in space, rotation, parity, and so on) is expressed by a unitary operator on the Hilbert space. If you want to change something about this rule, you are really building an entirely new theory from scratch.



Fixing the norm of the state vector along the way



Another group of "anything goes" rapists could propose a universal cure for all the non-unitary, nonlinear, and other theories. They could say that the only "constraint" we faced was the condition that the sum of probabilities had to remain 100 percent. Can't we just rescale the wave function – that may evolve according to any non-unitary, non-linear equation of motion – at each moment to manually guarantee that the sum of probabilities remains equal to 100 percent?



We may do it but we will run into conflicts with other basic physical or logical requirements that these rapists might be willing to overlook but that are paramount, anyway. What do I mean?



Imagine that you start with \(\ket{\psi(t_0)}\) and evolve it to\[



k(t_1,t_0)\cdot U(t_1,t_0) [ \ket{\psi(t_0)} ]



\] where I wrote the ket vector as an argument in the square brackets to indicate that the operator \(U\) may be nonlinear. Also, the added coefficient \(k\) is there to keep the total probability equal to 100 percent according to your own formula for the total probability, one that may differ from Born's rule.



That may look fine to you but we resuscitate ghosts and voodoo again. The required "renormalization constant" \(k(t_1,t_0)\) actually has to depend on the initial state as well if it's able to preserve the total probability in the general case – it was fraudulent to suppress this dependence. And if the initial wave function describes the "A or B" state, this \(k\) will inevitably depend on \(c_A\) and \(c_B\) again. The possibilities "A or B" will refuse to split in the final expression for \(\ket{\psi(t_1)}\). Again, it will be important whether Wigner's female friend at \(t_0\) "eliminated" the other possible outcomes or not. The eliminated outcomes will still affect the outcomes for the moment \(t_1\) that remain viable; equivalently, consciousness will become physically measurable and it will violate the laws of special relativity again.
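
Here is a minimal sketch of the problem. The raw evolution is a hypothetical non-unitary map \({\rm diag}(1,2)\) in the basis \(\ket A,\ket B\), followed by the manual rescaling \(k\); for simplicity the final probabilities are read off by Born's rule. All of these choices and names are assumptions made for illustration only.

```python
import math


def renormalized_evolve(psi):
    """Apply the toy non-unitary map diag(1, 2), then rescale to unit norm.
    Returns the state-dependent factor k and the rescaled final state."""
    a, b = psi
    raw = [a, 2 * b]
    k = 1 / math.sqrt(sum(abs(x) ** 2 for x in raw))
    return k, [k * x for x in raw]


s = 1 / math.sqrt(2)
c_a, c_b = s, s

# Wigner evolves the whole superposition and then asks for P(A) at t1.
k_full, final_full = renormalized_evolve([c_a, c_b])
p_a_wigner = abs(final_full[0]) ** 2

# His friend perceives A or B at t0 first; each branch is evolved separately.
k_a, final_a = renormalized_evolve([1.0, 0.0])
k_b, final_b = renormalized_evolve([0.0, 1.0])
p_a_friend = (abs(c_a) ** 2 * abs(final_a[0]) ** 2
              + abs(c_b) ** 2 * abs(final_b[0]) ** 2)

print(k_full, k_a, k_b)
print(p_a_wigner, p_a_friend)
```

The factor \(k\) comes out different for the superposition, for \(\ket A\), and for \(\ket B\), and the two ways of computing the probability of A at \(t_1\) disagree, which is exactly the dependence on the friend's "perception" described above.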



So it's important not to attempt to "renormalize" the formulae for probabilities by additional ad hoc fudge factors. One may argue that such fudge factors would damage the very logical structure of the theory but even if you were OK with it, you will ultimately see that your alternative theory allows the female observer to send superluminal signals by the "power of her will" (a superluminal form of telekinesis combined with telepathy) and violate the rules of relativity which seem to hold, according to observations and a robust symmetry principle extracted from all these observations.



So the probabilities have to be what they are according to the unadjusted formulae and because their sum has to remain equal to 100 percent and because the bilinear invariants are the only universally non-negative (for all states) invariants one may find for general classes of transformations, it follows that all "physical transformations" are encoded by unitary linear transformations on the Hilbert space and the squared complex amplitudes have to be interpreted as probabilities.



I feel that I have forgotten some other "popular" ways to rape quantum mechanics. But it's been enough so far and if I recall what I have forgotten, I will update this blog entry.