Lecture 6

Today, we first talked about POVM measurements. The second part of the lecture went over the basics of the quantum circuit model. POVM stands for positive operator valued measure. The outcomes of such a measurement are indexed by positive operators, and the word "measure" is used because there can conceivably be an infinite number of such outcomes, in which case you need a measure on positive operators over which you integrate. In this lecture, I'll only talk about the case where there are a finite number of possible outcomes, over a finite dimensional quantum space. In this case, the elements are positive semidefinite Hermitian matrices (finite dimensional positive operators), and there are no integrals, only a sum.

Before we can talk about POVM's, we should probably talk about projective measurements where some of the projections are on subspaces of dimension higher than 1. We've been using such measurements implicitly, but I haven't given a precise mathematical description of them yet. Suppose we have a set of projectors onto subspaces, Π1, Π2, ..., Πk, with the property that these subspaces are orthogonal, so

Πi Πj = 0 if i ≠ j,
and these subspaces span the entire space, that is,
Σ_{i=1}^{k} Πi = I.
Then there is a projective measurement associated with these subspaces which takes a quantum state | ψ ⟩ to the state
Πi | ψ ⟩ / ⟨ ψ | Πi | ψ ⟩^{1/2}
with probability
⟨ ψ | Πi | ψ ⟩.
That is, it projects the state | ψ ⟩ onto the i'th subspace with probability equal to the squared length of the projection onto that subspace. It should be clear that the case where each subspace has dimension 1 corresponds to measuring with respect to an orthonormal basis, the best known case of quantum measurements.
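The rule above can be sketched numerically. This is only an illustration, assuming NumPy; the function name `projective_measurement` and the particular projectors and state are hypothetical choices, not something from the lecture.

```python
import numpy as np

def projective_measurement(projectors, psi):
    """Return (probability, post-measurement state) for each projector.

    Outcomes with (numerically) zero probability get None as their state.
    """
    results = []
    for P in projectors:
        prob = float(np.real(np.vdot(psi, P @ psi)))   # <psi| P |psi>
        post = (P @ psi) / np.sqrt(prob) if prob > 1e-12 else None
        results.append((prob, post))
    return results

# Hypothetical example on a 3-dimensional space:
# one rank-2 projector and one rank-1 projector.
P1 = np.diag([1.0, 1.0, 0.0])   # projects onto span{|0>, |1>}
P2 = np.diag([0.0, 0.0, 1.0])   # projects onto span{|2>}
psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)

outcomes = projective_measurement([P1, P2], psi)
for i, (prob, post) in enumerate(outcomes, start=1):
    print(i, prob, post)
```

Note that the rank-2 outcome leaves the state as a superposition inside its subspace rather than collapsing it to a single basis vector.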

To motivate POVM's, let's consider an example. Suppose we have a qubit, so its state is a unit vector in the space with basis | 0 ⟩ and | 1 ⟩. We can embed this space in a larger space by simply adding a number of extra basis vectors. Let's add the basis vector | 2 ⟩. Now, there are orthonormal bases of this 3-dimensional space where none of the basis vectors line up with the space corresponding to the original qubit. What happens when we choose one of these bases for a projective measurement? What effect does this measurement have on the original qubit? Let's look at the example given by the basis:

√2/√3 | 0 ⟩ + 1/√3 | 2 ⟩
−1/√6 | 0 ⟩ + 1/√2 | 1 ⟩ + 1/√3 | 2 ⟩
−1/√6 | 0 ⟩ − 1/√2 | 1 ⟩ + 1/√3 | 2 ⟩
It's easy to check that these are orthonormal. When we take our vector | ψ ⟩ and measure it using the above basis, what happens? The probability of the first outcome is
| ( √2/√3 ⟨ 0 | + 1/√3 ⟨ 2 | ) | ψ ⟩ |^2 = | √2/√3 ⟨ 0 | ψ ⟩ |^2
since | 2 ⟩ is orthogonal to | ψ ⟩, that is, ⟨ 2 | ψ ⟩ = 0. If we define the unnormalized quantum states
| e1 ⟩ = √2/√3 | 0 ⟩
| e2 ⟩ = −1/√6 | 0 ⟩ + 1/√2 | 1 ⟩
| e3 ⟩ = −1/√6 | 0 ⟩ − 1/√2 | 1 ⟩
we similarly see that the probability of outcome i is
| ⟨ ei | ψ ⟩ |^2.
Now, suppose we have a number of these unnormalized vectors | ei ⟩, and we ask when the above rule for choosing probabilities of outcomes can possibly form a measurement. A necessary condition is that the probabilities add to 1, that is,
Σ_{i=1}^{k} | ⟨ ei | v ⟩ |^2 = 1
for all unit vectors | v ⟩ in our quantum state space. This condition is equivalent to
Σ_{i=1}^{k} ⟨ v | ei ⟩⟨ ei | v ⟩ = 1
and moving the sum inside the ⟨ v | ⋅ | v ⟩ we have
⟨ v | ( Σ_{i=1}^{k} | ei ⟩⟨ ei | ) | v ⟩ = 1
for all unit vectors | v ⟩. However, any Hermitian matrix satisfying ⟨ v | M | v ⟩ = 1 for all unit vectors | v ⟩ must be the identity matrix (one can easily check that all its eigenvalues are 1). Thus, we have the necessary condition
Σ_{i=1}^{k} | ei ⟩⟨ ei | = I.
It turns out that this condition is both necessary and sufficient for a collection of unnormalized vectors | ei ⟩ to be the special kind of POVM all of whose elements are rank 1. We next show that if we have a collection of such | ei ⟩, we can achieve the above outcome probabilities by a projective measurement in a higher dimensional space.
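As a quick numerical check of the example above, here is a sketch (assuming NumPy) that verifies the three basis vectors are orthonormal, that the truncated vectors | ei ⟩ satisfy the completeness condition, and that both descriptions give the same outcome probabilities. The test state `psi` is an arbitrary choice made for illustration.

```python
import numpy as np

s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)

# The orthonormal basis of the 3-dimensional space from the example (one per row).
basis = np.array([
    [ s2 / s3,  0.0,     1 / s3],
    [-1 / s6,   1 / s2,  1 / s3],
    [-1 / s6,  -1 / s2,  1 / s3],
])

# Dropping the |2> coordinate gives the unnormalized vectors |e_i> on the qubit.
e = basis[:, :2]

# Completeness: sum_i |e_i><e_i| should be the 2x2 identity.
completeness = sum(np.outer(v, v.conj()) for v in e)

# An arbitrary qubit state, embedded in 3 dimensions with zero |2> amplitude.
psi = np.array([0.6, 0.8j])
psi3 = np.append(psi, 0.0)

povm_probs = np.abs(e.conj() @ psi) ** 2         # |<e_i|psi>|^2
proj_probs = np.abs(basis.conj() @ psi3) ** 2    # |<b_i|psi>|^2 in 3 dimensions

print(np.allclose(basis @ basis.T, np.eye(3)))   # basis is orthonormal
print(np.allclose(completeness, np.eye(2)))      # completeness condition holds
print(povm_probs, proj_probs)
```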

Suppose we have k unnormalized quantum states | ei ⟩ in n dimensions such that

Σ_{i=1}^{k} | ei ⟩⟨ ei | = I.
Let's consider the n × k matrix M obtained by putting these vectors in the columns of a matrix. The entry M(i,j) is the i'th coordinate of | ej ⟩, or ⟨ i | ej ⟩. The matrix thus looks like
⟨ 1 | e1 ⟩   ⟨ 1 | e2 ⟩   ⟨ 1 | e3 ⟩   ...   ⟨ 1 | ek ⟩
⟨ 2 | e1 ⟩   ⟨ 2 | e2 ⟩   ⟨ 2 | e3 ⟩   ...   ⟨ 2 | ek ⟩
    ⋮            ⋮            ⋮                  ⋮
⟨ n | e1 ⟩   ⟨ n | e2 ⟩   ⟨ n | e3 ⟩   ...   ⟨ n | ek ⟩
Now, I'd like to claim that all of the rows of this matrix are orthonormal. Let's consider the inner product of row i and row i'. We have that this is
Σ_{j=1}^{k} ⟨ i' | ej ⟩⟨ ej | i ⟩
However, we can move the sum into the middle, where we obtain the identity matrix, because of the condition on the | ej ⟩. We thus have the inner product of row i and i' is
⟨ i' | ( Σ_{j=1}^{k} | ej ⟩⟨ ej | ) | i ⟩ = ⟨ i' | I | i ⟩ = δ(i',i)
We now have a set of n orthonormal rows in a k-dimensional space. By using Gram-Schmidt, we can extend these to a set of k orthonormal rows. Since any square matrix whose rows are orthonormal is unitary, and thus has orthonormal columns, the columns of this new k × k matrix correspond to a projective measurement. If this measurement is restricted to act on the n-dimensional space given by the first n basis vectors, the column vectors are projected onto the subspace spanned by the first n basis vectors, and this becomes the POVM given by the | ej ⟩ that we started with.
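The construction in this paragraph can be sketched in a few lines, assuming NumPy. The helper name `extend_to_unitary` is hypothetical, and running Gram-Schmidt over the standard basis vectors is just one convenient way to choose the extra rows; any orthonormal completion would do. The vectors used are the three | ei ⟩ from the qubit example earlier.

```python
import numpy as np

def extend_to_unitary(M, tol=1e-10):
    """Extend an n x k matrix with orthonormal rows to a k x k unitary.

    Gram-Schmidt is run over the standard basis vectors, keeping each one
    that has a nonzero component orthogonal to the rows collected so far.
    """
    rows = [r for r in M.astype(complex)]
    k = M.shape[1]
    for j in range(k):
        if len(rows) == k:
            break
        v = np.zeros(k, dtype=complex)
        v[j] = 1.0
        for r in rows:
            v = v - np.vdot(r, v) * r    # remove the component along r
        norm = np.linalg.norm(v)
        if norm > tol:
            rows.append(v / norm)
    return np.array(rows)

s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
# Rows of e are the unnormalized vectors |e_1>, |e_2>, |e_3> in n = 2 dimensions.
e = np.array([[s2 / s3, 0.0], [-1 / s6, 1 / s2], [-1 / s6, -1 / s2]])

M = e.T                       # n x k = 2 x 3, columns are the |e_j>
U = extend_to_unitary(M)

print(np.allclose(U @ U.conj().T, np.eye(3)))   # rows (and columns) orthonormal
print(np.allclose(U[:2, :], M))                 # first n coordinates of column j give |e_j>
```

The columns of `U` are the basis vectors of the projective measurement on the k-dimensional space; restricting each column to its first n coordinates recovers the | ej ⟩.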

Thus, we have discovered that if we start with any projective measurement with rank 1 projectors on a large space, and restrict to a smaller space, it can be expressed as a POVM given by a set of unnormalized vectors | ei ⟩ with the condition

Σ_{i=1}^{k} | ei ⟩⟨ ei | = I.
We have not yet said what happens when the projective measurement over the larger space is made up not of rank 1 projectors, but of larger rank projectors. How does this change affect the POVM? I am not going to go over the proof in detail, but it follows essentially the same lines as the proof above. The statement of the theorem is that any set of k positive semidefinite Hermitian matrices Ei satisfying the condition
Σ_{i=1}^{k} Ei = I
can form a POVM measurement. If this measurement is performed on a quantum state | ψ ⟩, the probability of the i'th outcome is
⟨ ψ | Ei | ψ ⟩
What actually happens to the quantum state? I am not telling you now, but I'm assigning a couple of homework problems related to this question. I will say that specifying the Ei is not enough to completely determine the residual quantum state after the measurement.
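For the general case, the outcome probabilities can be computed directly from the Ei. The sketch below assumes NumPy; the function names `is_povm` and `povm_probabilities` and the particular qubit POVM are hypothetical, chosen only so that one element has rank 2. It says nothing about the residual state, which the Ei alone do not determine.

```python
import numpy as np

def is_povm(elements, tol=1e-10):
    """Check that the matrices are Hermitian, positive semidefinite, and sum to I."""
    dim = elements[0].shape[0]
    for E in elements:
        if not np.allclose(E, E.conj().T, atol=tol):
            return False
        if np.linalg.eigvalsh(E).min() < -tol:
            return False
    return bool(np.allclose(sum(elements), np.eye(dim), atol=tol))

def povm_probabilities(elements, psi):
    """Outcome probabilities <psi| E_i |psi> for a pure state psi."""
    return np.array([np.real(np.vdot(psi, E @ psi)) for E in elements])

# Hypothetical qubit POVM whose elements are not projectors.
E1 = np.array([[0.5, 0.0], [0.0, 0.0]])
E2 = np.array([[0.0, 0.0], [0.0, 0.5]])
E3 = np.eye(2) - E1 - E2          # equals I/2, a rank-2 element

psi = np.array([1.0, 1.0j]) / np.sqrt(2)
probs = povm_probabilities([E1, E2, E3], psi)

print(is_povm([E1, E2, E3]))
print(probs)
```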

In the rest of the lecture, I started giving the circuit model of quantum computation. I'm not going to go into great detail here. The model's setting is the tensor product space of n qubits. The input (assuming the circuit itself is not designed to depend on the input) is given by initializing the first m qubits of the circuit to a sequence of | 0 ⟩'s and | 1 ⟩'s, depending on the input in binary. We can assume that the remaining n−m qubits are initialized to the state | 0 ⟩. The output is obtained by measuring the state of the computer in a canonical basis. The output is thus a sample from the probability distribution where j appears with probability | αj |^2, where the final state of the computer is

Σ_{j=0}^{2^n − 1} αj | j ⟩.
Since the output is probabilistic, we say that the computer succeeds in performing the computation if the output gives the right answer with high probability (say ≥ 2/3). If this is the case, we can obtain the right answer with high probability by repeating the computation. In practice, it is sufficient if the quantum computer outputs a state which a classical computer can efficiently use to obtain the answer with high probability. This last classical computation could in principle be performed on a quantum computer, so for models of computation it is sufficient to require the quantum computer to output the answer.
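To see how quickly repetition helps, here is a small sketch. It assumes one particular way of combining the runs, namely taking the majority answer over t independent runs; the lecture only says the computation is repeated, so the majority rule and the function name are my assumptions.

```python
from math import comb

def majority_success_probability(p, t):
    """Probability that more than half of t independent runs are correct,
    when each run is correct with probability p (t is assumed odd)."""
    return sum(comb(t, j) * p**j * (1 - p)**(t - j)
               for j in range(t // 2 + 1, t + 1))

# Per-run success probability 2/3, as in the text.
for t in (1, 5, 25, 101):
    print(t, majority_success_probability(2 / 3, t))
```

Whenever more than half the runs are correct, the majority answer is correct, so this quantity is a lower bound on the success probability of the majority rule.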

The last and most important piece of the model is the actual computation. We will require the computation to be performed by applying a fixed sequence of 1- and 2-qubit gates to the computer. We set the threshold at 2-qubit gates because if we only use one-qubit gates, we can never make two qubits interact, and thus we cannot achieve much computation. As we will see in the next lectures, 3-qubit gates can be efficiently simulated by a sequence of 2-qubit gates, so adding them to the model only reduces the number of gates by a constant multiplicative factor. Moreover, 3-qubit gates are much more difficult to achieve experimentally, as well as being less tractable theoretically.
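A tiny state-vector simulation makes the model concrete. This is a sketch assuming NumPy; the helper `apply_gate`, the convention that qubit 0 is the most significant bit of j, and the choice of a Hadamard followed by a CNOT are all illustrative assumptions rather than part of the lecture.

```python
import numpy as np

def apply_gate(state, gate, targets, n):
    """Apply a gate to the listed target qubits of an n-qubit state vector.

    Qubit 0 is taken to be the most significant bit of the basis label j.
    """
    m = len(targets)
    psi = state.reshape([2] * n)
    g = gate.reshape([2] * (2 * m))
    # Contract the gate's input indices with the target axes of the state.
    psi = np.tensordot(g, psi, axes=(list(range(m, 2 * m)), list(targets)))
    # tensordot puts the gate's output axes first; move them back into place.
    psi = np.moveaxis(psi, list(range(m)), list(targets))
    return psi.reshape(2 ** n)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # 1-qubit Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)      # 2-qubit gate, control listed first

n = 2
state = np.zeros(2 ** n, dtype=complex)
state[0] = 1.0                                    # start in |00>
state = apply_gate(state, H, [0], n)
state = apply_gate(state, CNOT, [0, 1], n)

probs = np.abs(state) ** 2                        # probability of each outcome j
print(probs)
```

The one-qubit gate alone leaves the two qubits in a product state; it is the 2-qubit gate that correlates the measurement outcomes.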