3 Quantum Information Theory: Basic Ingredients

3.1 Density Matrices

So rather than trying to follow along completely with the narrative, I’ll begin now trying to offer explanations of key concepts and notations that the paper might take for granted. Quantum mechanical notation can, I think, be thought of mathematically as linear algebra applied to vectors and matrices, subject to some constraints, describing a property analogous to the probability of outcomes upon measurement, but with the probability replaced by an amplitude that can be negative as well as positive and can carry both real and imaginary components (a complex scalar). As the number of potential outcomes incorporated into the analysis grows, so does the dimensionality: a state is a vector in a Hilbert space whose dimension equals the number of distinguishable outcomes. We can combine different systems using a tensor product operation.

The states are typically represented using the Bra-Ket notation <Bra| and |Ket>, where <Bra| represents a row vector and |Ket> a column vector. We can translate from a Bra to a Ket vector and vice versa using a conjugate-transpose operation (basically a transpose coupled with a complex conjugation that multiplies the imaginary portion of each scalar by -1). I am not positive but assume BraKet is an attempt at a pun on the term bracket, which if so I fully support. We can combine the vectors using either an inner product (dot product), abbreviated as <Bra|Ket>, which yields a complex scalar, or an outer product (a tensor product of a Ket with a Bra), abbreviated as |Ket><Bra|, which yields a matrix.

Demonstration of conversion between Bra and Ket vectors using the conjugate-transpose operation (a transpose coupled with a complex conjugation, which multiplies the imaginary portion of each complex scalar by -1).
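As a minimal sketch of the conversion described in the caption above, the Python below stores a Ket as a one-column nested list and obtains the Bra by conjugate-transposing it. The helper name `dagger` and the example amplitudes are illustrative assumptions, not anything taken from the paper.

```python
def dagger(matrix):
    """Conjugate-transpose of a nested-list matrix (rows become columns)."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r][c].conjugate() for r in range(rows)] for c in range(cols)]

# A Ket is a column vector: one entry per row (example amplitudes, not normalized).
ket = [[1 + 2j],
       [3 - 1j]]

# The matching Bra is a row vector with the sign of each imaginary part flipped.
bra = dagger(ket)
print(bra)                 # [[(1-2j), (3+1j)]]

# Applying the operation twice returns the original Ket.
print(dagger(bra) == ket)  # True
```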

Demonstration of Bra Ket vector operations
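The two ways of combining Bra and Ket vectors can be sketched in a few lines of Python. For brevity the vectors here are flat lists of amplitudes, and the function names `inner` and `outer` and the example states are assumptions made for illustration.

```python
def inner(phi, psi):
    """<phi|psi>: conjugate the entries of phi, multiply pairwise, and sum."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def outer(psi, phi):
    """|psi><phi|: matrix whose (r, c) entry is psi[r] * conj(phi[c])."""
    return [[a * b.conjugate() for b in phi] for a in psi]

zero = [1, 0]                   # |0>
one = [0, 1]                    # |1>
plus = [2 ** -0.5, 2 ** -0.5]   # (|0> + |1>) / sqrt(2)

print(inner(zero, one))    # 0, the two basis states are orthogonal
print(inner(plus, plus))   # approximately 1, a normalized state
print(outer(zero, one))    # [[0, 1], [0, 0]], a matrix rather than a scalar
```

The inner product collapses two vectors of length n to a single complex scalar, while the outer product expands them into an n-by-n matrix.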

This BraKet convention can be extended to incorporate an intermediate square matrix in the inner product using standard matrix multiplication, such as:

Demonstration of Bra Ket inner product incorporating an intermediate square matrix, note that the output is still a complex scalar even with the incorporation of the intermediate matrix.
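Here is a small sketch of the sandwiched inner product <φ|A|ψ>, first applying the matrix to the Ket and then taking the inner product with the Bra. The function name `sandwich` is hypothetical, and the Pauli matrices are used only as familiar example operators; they are not taken from this part of the paper.

```python
def sandwich(phi, A, psi):
    """<phi|A|psi> for flat-list vectors phi, psi and a nested-list square matrix A."""
    # Standard matrix-vector product A|psi>.
    A_psi = [sum(A[r][c] * psi[c] for c in range(len(psi))) for r in range(len(A))]
    # Inner product of <phi| with the result.
    return sum(p.conjugate() * v for p, v in zip(phi, A_psi))

zero, one = [1, 0], [0, 1]
X = [[0, 1], [1, 0]]       # Pauli-X (bit flip)
Z = [[1, 0], [0, -1]]      # Pauli-Z
Y = [[0, -1j], [1j, 0]]    # Pauli-Y, included to show a complex result

print(sandwich(one, X, zero))   # 1
print(sandwich(one, Z, one))    # -1
print(sandwich(zero, Y, one))   # a purely imaginary scalar, equal to -1j
```

Whatever the size of the matrix, the result is a single complex scalar.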

The tensor product operation (⊗) can also be extended from vector inputs to matrix inputs, such as:

Demonstration of tensor product operation applied to matrices. Note these matrices don’t have to be square.

The paper makes use of this operation to combine Hilbert spaces: if system A has a Hilbert space of dimension n and system B one of dimension m, the tensor product space has dimension n*m. In matrix terms, if a matrix A has dimensions (i, j) and a matrix B has dimensions (k, l), then A⊗B has dimensions (i*k, j*l).
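The dimension bookkeeping for the tensor product can be checked with a short sketch. The `kron` helper below is an assumed, hand-rolled Kronecker product on nested lists, and the matrices are arbitrary examples.

```python
def kron(A, B):
    """Kronecker (tensor) product of two nested-list matrices."""
    return [
        [a * b for a in row_a for b in row_b]   # columns ordered by (j, l)
        for row_a in A                          # rows ordered by (i, k)
        for row_b in B
    ]

def shape(M):
    return (len(M), len(M[0]))

A = [[1, 2, 3]]        # shape (1, 3), deliberately not square
B = [[1], [10]]        # shape (2, 1)
print(kron(A, B))          # [[1, 2, 3], [10, 20, 30]]
print(shape(kron(A, B)))   # (2, 3), i.e. (1*2, 3*1)

C = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]
print(kron(C, I))   # [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 4, 0], [0, 3, 0, 4]]
```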

I’m trying to cover a lot of the fundamentals of the linear algebra notation conventions for quantum systems here, and in the process getting a touch removed from the paper’s narrative, so let’s jump back in by addressing the theme of this section 3.1, density matrices. I confess I ended up pulling up some supplemental Wikipedia reading to follow along, and I think in the process found a pretty intuitive textual definition of the difference between a pure state and a mixed state. In each case it is possible to combine multiple quantum states; however, in a pure state the states are combined in a quantum superposition (a weighted sum with complex coefficients), whereas in a mixed state’s density matrix the states are combined as a classical statistical ensemble, each weighted by a classical probability. Remember that a classical probability is a real scalar subject to 0<=p<=1, while the quantum equivalent is a complex scalar q with p = |q|²<=1, and in both cases ∑p=1. (The standard term for this quantum equivalent is the probability amplitude, although I’d happily vote for quability, which has a nice ring to it.)

(Via Eq 3.10, 3.15+) Demonstration of derivation of density matrices for pure and mixed states, where p_i is a classical probability with ∑p_i=1.
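To make the pure versus mixed distinction concrete, here is a minimal sketch that builds ρ = |ψ><ψ| for a pure superposition and ρ = ∑ p_i |ψ_i><ψ_i| for a classical 50/50 ensemble. The helper names and the choice of states are illustrative assumptions.

```python
def outer(psi, phi):
    """|psi><phi| as a nested-list matrix."""
    return [[a * b.conjugate() for b in phi] for a in psi]

def pure_density(psi):
    """Density matrix of a single pure state."""
    return outer(psi, psi)

def mixed_density(probabilities, states):
    """Classical ensemble: sum of p_i * |psi_i><psi_i|."""
    dim = len(states[0])
    rho = [[0j] * dim for _ in range(dim)]
    for p, psi in zip(probabilities, states):
        projector = outer(psi, psi)
        for r in range(dim):
            for c in range(dim):
                rho[r][c] += p * projector[r][c]
    return rho

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

zero, one = [1, 0], [0, 1]
plus = [2 ** -0.5, 2 ** -0.5]                        # superposition of |0> and |1>

rho_pure = pure_density(plus)                        # every entry close to 0.5
rho_mixed = mixed_density([0.5, 0.5], [zero, one])   # 0.5 on the diagonal, 0 elsewhere

print(rho_pure)
print(rho_mixed)
print(trace(rho_pure), trace(rho_mixed))             # both close to 1
```

Both matrices give a 50/50 chance of measuring 0 or 1 (the diagonal), but only the superposition carries the off-diagonal terms.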

The paper makes use of the trace operation (Tr) as well so I’ll expand on it quickly. When we have a combination of quantum systems, say A which represents a qubit and B which could be a second qubit or alternatively could represent the rest of the universe, by taking a partial trace over B we’re effectively saying that we can choose to ignore B and drop it altogether from our evaluation, leaving a density matrix that describes A alone. This is a fortunate ability because I imagine nailing down the state space of the rest of the universe might be a challenge for most scientists. Looking at it from a linear algebra standpoint, the trace of a square matrix is the sum of its diagonal elements; the partial trace applies that sum only over the B indices of the combined matrix.
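A sketch of tracing out B from a two-qubit density matrix follows. It assumes the combined basis is ordered so that row index = a*dim_B + b; the function names and the two example states (a product state and a Bell state) are illustrative choices rather than anything specified in the paper.

```python
def outer(psi, phi):
    return [[x * y.conjugate() for y in phi] for x in psi]

def trace(M):
    """Ordinary trace: sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))

def partial_trace_B(rho_AB, dim_A, dim_B):
    """Sum over the B index only, leaving a dim_A x dim_A matrix for A."""
    return [
        [sum(rho_AB[a * dim_B + b][a2 * dim_B + b] for b in range(dim_B))
         for a2 in range(dim_A)]
        for a in range(dim_A)
    ]

s = 2 ** -0.5

# Product state |0> (x) |+>: ignoring B leaves A in the pure state |0>.
product = [s, s, 0, 0]
rho_A_product = partial_trace_B(outer(product, product), 2, 2)

# Bell state (|00> + |11>) / sqrt(2): ignoring B leaves A maximally mixed.
bell = [s, 0, 0, s]
rho_A_bell = partial_trace_B(outer(bell, bell), 2, 2)

print(rho_A_product)   # close to [[1, 0], [0, 0]]
print(rho_A_bell)      # close to [[0.5, 0], [0, 0.5]]
```

In both cases the reduced matrix still has trace 1, so it remains a valid description of A on its own.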

Note that these density matrices are described in the paper as subject to the constraints of being hermitian and positive semi-definite, so let’s quickly define these terms. Hermitian matrices are square matrices in which the value of the complex scalar in row i and column j is equal to the complex conjugate of the value in row j and column i (which forces the diagonal entries to be real), where a complex conjugate just means that the imaginary part of the number is multiplied by -1. To be honest I’m not sure why this matters or how this constraint comes about so let’s just go with it. As for positive semi-definite, I think the easiest way to define this is the constraint that the eigenvalues of the matrix are all non-negative (greater than or equal to zero). Of course this opens up a whole can of worms, now we have to address eigenvalues.

Here’s my layman’s sketch (which I am basing heavily on the clear writeup in Wolfram MathWorld): given a square matrix A, there exists a scalar eigenvalue λ and a corresponding nonzero (column) eigenvector X such that AX = λX. Thus λ is a kind of characteristic feature of a matrix, and it turns out there will be several such eigenvalues λi and corresponding eigenvectors Xi: an n×n matrix has n eigenvalues, counting repeats. Further, it is possible to transform (e.g. rotate through its dimensions) the basis of a hermitian matrix A, using a unitary matrix built from the set of eigenvectors, to result in a matrix of the same dimensions whose values are all 0 save for the diagonal, whose values are the eigenvalues λi, a process known as diagonalization.
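The definitions above can be checked on a small example. This sketch uses the closed-form eigenvalues of a 2x2 hermitian matrix, λ = (a+d)/2 ± sqrt(((a-d)/2)² + |b|²), and assumes a nonzero off-diagonal entry b so that (b, λ - a) is a valid eigenvector. The example matrix and all function names are assumptions chosen for illustration.

```python
import math

def is_hermitian(M, tol=1e-12):
    """Entry (r, c) must equal the complex conjugate of entry (c, r)."""
    n = len(M)
    return all(abs(M[r][c] - M[c][r].conjugate()) < tol
               for r in range(n) for c in range(n))

def eigen_hermitian_2x2(H):
    """Eigenvalues and unit eigenvectors of a 2x2 hermitian H with H[0][1] != 0."""
    a, b, d = H[0][0].real, H[0][1], H[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    pairs = []
    for lam in (mean + radius, mean - radius):
        v = [b, lam - a]                     # solves (H - lam*I) v = 0 when b != 0
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        pairs.append((lam, [x / norm for x in v]))
    return pairs

# Example density matrix: hermitian, trace 1, with complex off-diagonal terms.
rho = [[0.5, 0.25j],
       [-0.25j, 0.5]]

pairs = eigen_hermitian_2x2(rho)
eigenvalues = [lam for lam, _ in pairs]

print(is_hermitian(rho))                       # True
print(eigenvalues)                             # [0.75, 0.25]
print(all(lam >= 0 for lam in eigenvalues))    # True, so positive semi-definite

# Check the defining relation A X = lambda X for each pair.
for lam, v in pairs:
    residual = [sum(rho[r][c] * v[c] for c in range(2)) - lam * v[r] for r in range(2)]
    print(max(abs(x) for x in residual) < 1e-12)
```

In the eigenvector basis this matrix becomes diagonal with entries 0.75 and 0.25, which can be read as the classical probabilities of a two-state ensemble.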

When Witten describes the purification of a density matrix, this diagonalization property comes into play in the proof: writing a mixed state ρA in its eigenbasis, we can introduce an auxiliary system B with an orthonormal set of states, one per eigenvector, and build a pure state ψAB on the combined system whose partial trace over B gives back ρA. In other words, what started as a mixed state ρA can always be viewed as one part of a larger pure state ψAB.

Eq 3.13: Demonstration of the purification of the state A by incorporation of an orthonormal set of states in B.
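A rough numerical sketch of purification: given eigenvalues p_i and eigenvectors of ρA, form ψAB = ∑ sqrt(p_i) |a_i> ⊗ |b_i> with orthonormal |b_i>, then confirm that tracing out B returns ρA. The specific ρA, the choice of the 0/1 basis for B, and the helper names are all assumptions made for this example.

```python
import math

def kron_vec(u, v):
    """Tensor product of two flat-list vectors."""
    return [a * b for a in u for b in v]

def outer(psi, phi):
    return [[x * y.conjugate() for y in phi] for x in psi]

def partial_trace_B(rho_AB, dim_A, dim_B):
    return [
        [sum(rho_AB[a * dim_B + b][a2 * dim_B + b] for b in range(dim_B))
         for a2 in range(dim_A)]
        for a in range(dim_A)
    ]

def purify(probabilities, basis_A, basis_B):
    """psi_AB = sum_i sqrt(p_i) * (a_i tensor b_i)."""
    dim = len(basis_A[0]) * len(basis_B[0])
    psi_AB = [0j] * dim
    for p, a, b in zip(probabilities, basis_A, basis_B):
        term = kron_vec(a, b)
        psi_AB = [x + math.sqrt(p) * t for x, t in zip(psi_AB, term)]
    return psi_AB

s = 1 / math.sqrt(2)
probabilities = [0.75, 0.25]               # eigenvalues of the example rho_A
basis_A = [[1j * s, s], [1j * s, -s]]      # matching orthonormal eigenvectors
basis_B = [[1, 0], [0, 1]]                 # any orthonormal set in B works

psi_AB = purify(probabilities, basis_A, basis_B)
rho_A = partial_trace_B(outer(psi_AB, psi_AB), 2, 2)

print(sum(abs(x) ** 2 for x in psi_AB))    # close to 1, a normalized pure state
print(rho_A)                               # close to [[0.5, 0.25j], [-0.25j, 0.5]]
```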

Before closing this section, one further note on this Bra Ket notation, which I’ll include as kind of an aside because it’s relevant and fundamental to many of my earlier writings on quantum computation even though it’s not directly addressed by Witten. For the special case of quantum computing with a qubit measured in the binary 0/1 basis (as opposed to, say, the +/- basis, another common choice), and thus a 2 dimensional Hilbert space, the Ket vectors describing individual qubits can be combined using the tensor product operation into a joint state of multiple qubits, such as we might get as we scale up our quantum computers. Here the Ket |0> refers to a single qubit measurement generating a 0 and |1> to a measurement generating a 1. After combining two qubits, the Ket |00> indicates that measurement on both qubits produces a 0, |01> that the first qubit measures a 0 and the second a 1, and so on for |10> and |11>, with the combined Hilbert space having dimension 2^n where n is the number of qubits. We can describe the state of a single qubit’s superposition as |ψ> = a|0> + b|1>, where a and b are complex scalars subject to the constraint |a|²+|b|² = 1. Combining multiple qubits thus entails:
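A minimal sketch of that combination, assuming two independent qubits with made-up amplitudes: the joint state is the tensor product of the two single-qubit Kets, giving one amplitude each for |00>, |01>, |10> and |11>.

```python
def kron_vec(u, v):
    """Tensor product of two flat-list state vectors."""
    return [a * b for a in u for b in v]

s = 2 ** -0.5
qubit_1 = [0.6, 0.8j]   # a|0> + b|1> with |a|^2 + |b|^2 = 0.36 + 0.64 = 1
qubit_2 = [s, s]        # c|0> + d|1>

# Joint amplitudes in the order ac, ad, bc, bd.
joint = kron_vec(qubit_1, qubit_2)
labels = ["00", "01", "10", "11"]
probabilities = {label: abs(amp) ** 2 for label, amp in zip(labels, joint)}

print(probabilities)                         # roughly 0.18, 0.18, 0.32, 0.32
print(sum(probabilities.values()))           # close to 1
print(len(kron_vec(joint, qubit_2)))         # 8, i.e. 2^3 for three qubits
```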

Demonstration of deriving state space for a pair of qubits. Note that this can be further extended to additional qubits via tensor product, and that the constraint that the squared magnitudes of the coefficients sum to unity represents the sum of their respective probabilities for generating the corresponding measurements. Note that as we transition from a pure state to a mixed state’s density matrix (such as after incorporation of some channel of error to our state), the probabilities still sum to 1 (Tr ρ = 1); what changes is the purity Tr(ρ²), which is =1 for a pure state and <1 for a mixed state, bottoming out at 1/d for a maximally mixed state in a d dimensional Hilbert space. For a single qubit the same information is carried by the length of the Bloch vector, which is 1 for a pure state and 0 for a fully decohered / maximally mixed state. For real qubits this value tends to shrink with increasing time t at a rate based on the architecture, although its decline can be slowed by incorporating error correction.
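One common way to put a number on the slide from pure to mixed is the purity Tr(ρ²), alongside the single-qubit Bloch vector length. The sketch below uses a toy dephasing model in which the off-diagonal terms of |+><+| are scaled by (1 - gamma); the parameter gamma and this error model are hypothetical stand-ins for whatever noise a real device has.

```python
def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def purity(rho):
    """Tr(rho^2): 1 for a pure state, 1/d for a maximally mixed state of dimension d."""
    n = len(rho)
    return sum(rho[r][c] * rho[c][r] for r in range(n) for c in range(n)).real

def bloch_length(rho):
    """Length of the Bloch vector (x, y, z) read off a 2x2 density matrix."""
    x = 2 * rho[1][0].real
    y = 2 * rho[1][0].imag
    z = (rho[0][0] - rho[1][1]).real
    return (x * x + y * y + z * z) ** 0.5

def dephased_plus(gamma):
    """|+><+| with off-diagonal terms scaled by (1 - gamma); gamma is hypothetical."""
    off = 0.5 * (1 - gamma)
    return [[0.5, off], [off, 0.5]]

for gamma in (0.0, 0.5, 1.0):
    rho = dephased_plus(gamma)
    print(gamma, trace(rho), purity(rho), bloch_length(rho))
```

The trace stays at 1 throughout, while the purity falls from 1 to 0.5 (that is, 1/d with d = 2) and the Bloch vector length falls from 1 to 0.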