\section{Quantum versus classical physics} \label{s:qvc}

In order to think about quantum information theory, let us first state the principles of non-relativistic quantum mechanics, as follows (Shankar 1980).
\begin{enumerate}
\item The state of an isolated system $\cal Q$ is represented by a vector $\ket{\psi(t)}$ in a Hilbert space.
\item Variables such as position and momentum are termed observables and are represented by Hermitian operators. The position and momentum operators $X,P$ have the following matrix elements in the eigenbasis of $X$:
\begin{eqnarray*}
\bra{x} X \ket{x'} &=& x \delta (x-x') \\
\bra{x} P \ket{x'} &=& -i \hbar \delta' (x-x')
\end{eqnarray*}
\item The state vector obeys the Schr\"odinger equation
\beq
i \hbar \frac{d}{dt} \ket{\psi(t)} = {\cal H} \ket{\psi(t)} \label{Sch}
\eeq
where ${\cal H}$ is the quantum Hamiltonian operator.
\item Measurement postulate.
\end{enumerate}

The fourth postulate, which has not been made explicit, is a subject of some debate, since quite different interpretive approaches lead to the same predictions, and the concept of `measurement' is fraught with ambiguities in quantum mechanics (Wheeler and Zurek 1983, Bell 1987, Peres 1993). A statement which is valid for most practical purposes is that certain physical interactions are recognisably `measurements', and their effect on the state vector $\ket{\psi}$ is to change it to an eigenstate $\ket{k}$ of the variable being measured, the value of $k$ being randomly chosen with probability $P \propto |\left< k \right. \ket{\psi}|^2$. The change $\ket{\psi} \rightarrow \ket{k}$ can be expressed by the projection operator $(\ket{k}\bra{k})/\left< k \right. \ket{\psi}$.

Note that according to the above equations, the evolution of an isolated quantum system is always {\em unitary}, in other words $\ket{\psi(t)} = U(t) \ket{\psi(0)}$ where $U(t) = \exp(-i \int {\cal H} dt / \hbar)$ is a unitary operator, $U U^{\dagger} = I$.
This is true, but there is a difficulty that there is no such thing as a truly isolated system (i.e. one which experiences no interactions with any other systems), except possibly the whole universe. Therefore there is always some approximation involved in using the Schr\"odinger equation to describe real systems.

One way to handle this approximation is to speak of the system $\cal Q$ and its environment $\cal T$. The evolution of $\cal Q$ is primarily that given by its Schr\"odinger equation, but the interaction between $\cal Q$ and $\cal T$ has, in part, the character of a measurement of $\cal Q$. This produces a non-unitary contribution to the evolution of $\cal Q$ (since projections are not unitary), and this ubiquitous phenomenon is called {\em decoherence}. I have underlined these elementary ideas because they are central in what follows.

We can now begin to bring together ideas of physics and of information processing. For, it is clear that much of the wonderful behaviour we see around us in Nature could be understood as a form of information processing, and conversely our computers are able to simulate, by their processing, many of the patterns of Nature. The obvious, if somewhat imprecise, questions are
\begin{enumerate}
\item ``can Nature usefully be regarded as essentially an information processor?''
\item ``could a computer simulate the whole of Nature?''
\end{enumerate}

The principles of quantum mechanics suggest that the answer to the first question is {\em yes}\footnote{This does not necessarily imply that such language captures everything that can be said about Nature, merely that this is a useful abstraction at the descriptive level of physics. I do not believe any physical `laws' could be adequate to completely describe human behaviour, for example, since they are sufficiently approximate or non-prescriptive to leave us room for manoeuvre (Polkinghorne 1994).}.
For, the state vector $\ket{\psi}$ so central to quantum mechanics is a concept very much like those of information science: it is an abstract entity which contains exactly all the information about the system $\cal Q$. The word `exactly' here is a reminder that not only is $\ket{\psi}$ a complete description of $\cal Q$, it is also one that does not contain any extraneous information which cannot meaningfully be associated with $\cal Q$. The importance of this in quantum statistics of Fermi and Bose gases was mentioned in the introduction.

The second question can be made more precise by converting the Church-Turing thesis into a principle of physics,

{\em Every finitely realizable physical system can be simulated arbitrarily closely by a universal model computing machine operating by finite means.}

This statement is based on that of Deutsch (1985). The idea is to propose that a principle like this is not derived from quantum mechanics, but rather underpins it, like other principles such as that of conservation of energy. The qualifications introduced by `finitely realizable' and `finite means' are important in order to state something useful.

The new version of the Church-Turing thesis (now called the `Church-Turing Principle') does not refer to Turing machines. This is important because there are fundamental differences between the very nature of the Turing machine and the principles of quantum mechanics. One is described in terms of operations on classical bits, the other in terms of evolution of quantum states. Hence there is the possibility that the universal Turing machine, and hence all classical computers, might not be able to simulate some of the behaviour to be found in Nature. Conversely, it may be physically possible (i.e. not ruled out by the laws of Nature) to realise a new type of computation essentially different from that of classical computer science. This is the central aim of quantum computing.
\subsection{EPR paradox, Bell's inequality} \lab{s:EPR}

In 1935 Einstein, Podolsky and Rosen (EPR) drew attention to an important feature of non-relativistic quantum mechanics. Their argument, and Bell's analysis, can now be recognised as one of the seeds from which quantum information theory has grown. The EPR paradox should be familiar to any physics graduate, and I will not repeat the argument in detail. However, the main points will provide a useful way in to quantum information concepts.

The EPR thought-experiment can be reduced in essence to an experiment involving pairs of two-state quantum systems (Bohm 1951, Bohm and Aharonov 1957). Let us consider a pair of spin-half particles $A$ and $B$, writing the ($m_z = +1/2$) spin `up' state $\ket{\uparrow}$ and the ($m_z = -1/2$) spin `down' state $\ket{\downarrow}$. The particles are prepared initially in the singlet state $(\ket{\uparrow}\ket{\downarrow} - \ket{\downarrow}\ket{\uparrow})/ \sqrt{2}$, and they subsequently fly apart, propagating in opposite directions along the $y$-axis. Alice and Bob are widely separated, and they receive particles $A$ and $B$ respectively. EPR were concerned with whether quantum mechanics provides a complete description of the particles, or whether something was left out, some property of the spin angular momenta ${\bf s}_A,{\bf s}_B$ which quantum theory failed to describe. Such a property has since become known as a `hidden variable'. They argued that something was left out, because this experiment allows one to predict with certainty the result of measuring any component of ${\bf s}_B$, without causing any disturbance of $B$. Therefore all the components of ${\bf s}_B$ have definite values, say EPR, and the quantum theory only provides an incomplete description.
To make the certain prediction without disturbing $B$, one chooses any axis $\eta$ along which one wishes to know $B$'s angular momentum, and then measures not $B$ but $A$, using a Stern-Gerlach apparatus aligned along $\eta$. Since the singlet state carries no net angular momentum, one can be sure that the corresponding measurement on $B$ would yield the opposite result to the one obtained for $A$. The EPR paper is important because it is carefully argued, and the fallacy is hard to unearth. The fallacy can be exposed in one of two ways: one can say either that Alice's measurement does influence Bob's particle, or (which I prefer) that the quantum state vector $\ket{\phi}$ is not an intrinsic property of a quantum system, but an expression for the information content of a quantum variable. In a singlet state there is mutual information between $A$ and $B$, so the information content of $B$ changes when we learn something about $A$. So far there is no difference from the behaviour of classical information, so nothing surprising has occurred. A more thorough analysis of the EPR experiment yields a big surprise. This was discovered by Bell (1964,1966). Suppose Alice and Bob measure the spin component of $A$ and $B$ along different axes $\eta_A$ and $\eta_B$ in the $x$-$z$ plane. Each measurement yields an answer $+$ or $-$. Quantum theory and experiment agree that the probability for the two measurements to yield the same result is $\sin^2((\phi_A - \phi_B)/2)$, where $\phi_A$ ($\phi_B$) is the angle between $\eta_A$ ($\eta_B$) and the $z$ axis. However, there is no way to assign {\em local} properties, that is properties of $A$ and $B$ independently, which lead to this high a correlation, in which the results are certain to be opposite when $\phi_A = \phi_B$, certain to be equal when $\phi_A = \phi_B + 180^{\circ}$, and also, for example, have a $\sin^2(60^{\circ}) = 3/4$ chance of being equal when $\phi_A - \phi_B = 120^{\circ}$. 
Feynman (1982) gives a particularly clear analysis. At $\phi_A - \phi_B = 120^{\circ}$ the highest correlation which local hidden variables could produce is $2/3$.

The Bell-EPR argument allows us to identify a task which is physically possible, but which no classical computer could perform: when repeatedly given inputs $\phi_A$, $\phi_B$ at completely separated locations, respond quickly (i.e. too quickly to allow light-speed communication between the locations) with yes/no responses which are perfectly correlated when $\phi_A = \phi_B + 180^{\circ}$, anticorrelated when $\phi_A = \phi_B$, and more than $\sim 70\%$ correlated when $\phi_A - \phi_B = 120^{\circ}$.

Experimental tests of Bell's argument were carried out in the 1970s and 1980s and the quantum theory was verified (Clauser and Shimony 1978, Aspect {\em et al.} 1982; for more recent work see Aspect (1991), Kwiat {\em et al.} 1995 and references therein). This was a significant new probe into the logical structure of quantum mechanics. The argument can be made even stronger by considering a more complicated system. In particular, for three spins prepared in a state such as $(\ket{\uparrow}\ket{\uparrow} \ket{\uparrow} + \ket{\downarrow}\ket{\downarrow}\ket{\downarrow}) / \sqrt{2}$, Greenberger, Horne and Zeilinger (1989) (GHZ) showed that a single measurement along a horizontal axis for two particles, and along a vertical axis for the third, will yield with certainty a result which is the exact opposite of what a local hidden-variable theory would predict. A wider discussion and references are provided by Greenberger {\em et al.} (1990), Mermin (1990).

The Bell-EPR correlations show that quantum mechanics permits at least one simple task which is beyond the capabilities of classical computers, and they hint at a new type of mutual information (Schumacher and Nielsen 1996). In order to pursue these ideas, we will need to construct a complete theory of quantum information.
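
The correlations quoted in the discussion of Bell's analysis can be checked directly. The following is only a sketch: the rotated states $\ket{\uparrow_\phi}$, $\ket{\downarrow_\phi}$, the shorthand $\ket{S}$ for the singlet, and the labels $a_1,a_2,a_3$ are notation introduced here for illustration. Spin states along an axis at angle $\phi$ to the $z$ axis in the $x$-$z$ plane may be written
\begin{eqnarray*}
\ket{\uparrow_\phi} &=& \cos(\phi/2) \ket{\uparrow} + \sin(\phi/2) \ket{\downarrow}, \\
\ket{\downarrow_\phi} &=& -\sin(\phi/2) \ket{\uparrow} + \cos(\phi/2) \ket{\downarrow}.
\end{eqnarray*}
With $\ket{S} = (\ket{\uparrow}\ket{\downarrow} - \ket{\downarrow}\ket{\uparrow})/\sqrt{2}$, the amplitudes for the two outcomes in which Alice and Bob obtain the same result are
\begin{eqnarray*}
\left( \bra{\uparrow_{\phi_A}} \bra{\uparrow_{\phi_B}} \right) \ket{S}
 &=& \frac{1}{\sqrt{2}} \left[ \cos(\phi_A/2) \sin(\phi_B/2) - \sin(\phi_A/2) \cos(\phi_B/2) \right]
 = - \frac{1}{\sqrt{2}} \sin \left( \frac{\phi_A - \phi_B}{2} \right), \\
\left( \bra{\downarrow_{\phi_A}} \bra{\downarrow_{\phi_B}} \right) \ket{S}
 &=& \frac{1}{\sqrt{2}} \left[ - \sin(\phi_A/2) \cos(\phi_B/2) + \cos(\phi_A/2) \sin(\phi_B/2) \right]
 = - \frac{1}{\sqrt{2}} \sin \left( \frac{\phi_A - \phi_B}{2} \right),
\end{eqnarray*}
so that the total probability of equal results is $\frac{1}{2}\sin^2((\phi_A - \phi_B)/2) + \frac{1}{2}\sin^2((\phi_A - \phi_B)/2) = \sin^2((\phi_A - \phi_B)/2)$.

For comparison, consider a local hidden-variable model restricted to three axes at $0^{\circ}$, $120^{\circ}$ and $240^{\circ}$. Each pair of particles must carry predetermined answers $a_1, a_2, a_3 \in \{+,-\}$ for $A$ and, in order to reproduce the certain anticorrelation at equal angles, the opposite answers $-a_1, -a_2, -a_3$ for $B$. When Alice uses axis $i$ and Bob uses axis $j \ne i$, the results are equal only if $a_i \ne a_j$. Of the three pairs $(a_1,a_2)$, $(a_2,a_3)$, $(a_1,a_3)$ at most two can differ, since three two-valued quantities cannot all be different from one another. Assuming the pair of axes is chosen uniformly at random, the probability of equal results is therefore at most $2/3$, whereas the quantum value is $\sin^2(60^{\circ}) = 3/4$.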
\section{Quantum Information}

Just as in the discussion of classical information theory, quantum information ideas are best introduced by stating them, and then showing afterwards how they link together.

Quantum communication is treated in a special issue of {\em J. Mod. Opt.}, volume 41 (1994); reviews and references for quantum cryptography are given by Bennett {\em et al.} (1992); Hughes {\em et al.} (1995); Phoenix and Townsend (1995); Brassard and Cr\'epeau (1996); Ekert (1997). Spiller (1996) reviews both communication and computing.

\subsection{Qubits}

The elementary unit of quantum information is the {\em qubit} (Schumacher 1995). A single qubit can be envisaged as a two-state system such as a spin-half or a two-level atom (see fig. 12), but when we measure quantum information in qubits we are really doing something more abstract: a quantum system is said to have $n$ qubits if it has a Hilbert space of $2^n$ dimensions, and so has available $2^n$ {\em mutually orthogonal} quantum states (recall that $n$ classical bits can represent up to $2^n$ different things). This definition of the qubit will be elaborated in section \ref{s:qdc}.

We will write two orthogonal states of a single qubit as $\{ \ket{0}, \ket{1} \}$. More generally, $2^n$ mutually orthogonal states of $n$ qubits can be written $\{ \ket{i} \}$, where $i$ is an $n$-bit binary number. For example, for three qubits we have $\{ \ket{000},\ket{001}, \ket{010}, \ket{011},$ $\ket{100},\ket{101}, \ket{110}, \ket{111} \}$.

\subsection{Quantum gates}

Simple unitary operations on qubits are called quantum `logic gates' (Deutsch 1985, 1989). For example, if a qubit evolves as $\ket{0} \rightarrow \ket{0}$, $\ket{1} \rightarrow \exp(i\omega t)\ket{1}$, then after time $t$ we may say that the operation, or `gate'
\beq
P(\theta) = \left( \begin{array}{cc} 1 & 0 \\ 0 & e^{i \theta} \end{array} \right)
\eeq
has been applied to the qubit, where $\theta = \omega t$.
This can also be written $P(\theta) = \ket{0}\bra{0} + \exp(i\theta) \ket{1}\bra{1}$. Here are some other elementary quantum gates:
\begin{eqnarray}
I &\equiv& \ket{0}\bra{0} + \ket{1}\bra{1} \;\; = \mbox{identity} \\
X &\equiv& \ket{0}\bra{1} + \ket{1}\bra{0} \;\; = \mbox{\sc not}\\
Z &\equiv& P(\pi) \\
Y &\equiv& X Z \\
H &\equiv& \frac{1}{\sqrt{2}}\left[ \rule{0em}{1.3em} \left(\ket{0} + \ket{1}\right)\bra{0} + \left(\ket{0} - \ket{1}\right)\bra{1} \right]
\end{eqnarray}
These all act on a single qubit, and can be achieved by the action of some Hamiltonian in Schr\"odinger's equation, since they are all unitary operators\footnote{The letter $H$ is adopted for the final gate here because its effect is a {\em Hadamard} transformation. This is not to be confused with the Hamiltonian ${\cal H}$.}. There are an infinite number of single-qubit quantum gates, in contrast to classical information theory, where only two logic gates are possible for a single bit, namely the identity and the logical {\sc not} operation. The quantum {\sc not} gate carries $\ket{0}$ to $\ket{1}$ and vice versa, and so is analogous to a classical {\sc not}. This gate is also called $X$ since it is the Pauli $\sigma_x$ operator. Note that the set $\{ I, X, Y, Z \}$ is a group under multiplication, provided overall signs are disregarded (for example $ZX = -Y$).

Of all the possible unitary operators acting on a pair of qubits, an interesting subset is those which can be written $\ket{0}\bra{0}\otimes I + \ket{1}\bra{1}\otimes U$, where $I$ is the single-qubit identity operation, and $U$ is some other single-qubit gate. Such a two-qubit gate is called a ``controlled $U$'' gate, since the action $I$ or $U$ on the second qubit is controlled by whether the first qubit is in the state $\ket{0}$ or $\ket{1}$.
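
It may help to see these gates as explicit matrices. The following is a sketch under conventions assumed here for illustration: $\ket{0}$ and $\ket{1}$ are represented by the column vectors $(1,0)^T$ and $(0,1)^T$, and two-qubit states are ordered $\ket{00}, \ket{01}, \ket{10}, \ket{11}$. The single-qubit gates defined above then read
\[
X = \left( \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right), \;\;\;
Z = \left( \begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array} \right), \;\;\;
Y = XZ = \left( \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right), \;\;\;
H = \frac{1}{\sqrt{2}} \left( \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array} \right),
\]
and each can be checked to satisfy $U U^{\dagger} = I$. In the same conventions a controlled $U$ gate has the block form
\[
\ket{0}\bra{0}\otimes I + \ket{1}\bra{1}\otimes U = \left( \begin{array}{cc} I & 0 \\ 0 & U \end{array} \right),
\]
where each entry is a $2 \times 2$ block. It is unitary whenever $U$ is, since the product of the matrix with its adjoint is block diagonal with entries $I I^{\dagger}$ and $U U^{\dagger}$.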
For example, the effect of controlled-{\sc not} (``{\sc cnot}'') is
\begin{eqnarray}
\ket{00} &\rightarrow& \ket{00} \nonumber \\
\ket{01} &\rightarrow& \ket{01} \nonumber \\
\ket{10} &\rightarrow& \ket{11} \nonumber \\
\ket{11} &\rightarrow& \ket{10} \label{cnot}
\end{eqnarray}
Here the second qubit undergoes a {\sc not} if and only if the first qubit is in the state $\ket{1}$. This list of state changes is the analogue of the truth table for a classical binary logic gate. The effect of controlled-{\sc not} acting on a state $\ket{a}\ket{b}$ can be written $a \rightarrow a$, $b \rightarrow a \oplus b$, where $\oplus$ signifies the exclusive or ({\sc xor}) operation. For this reason, this gate is also called the {\sc xor} gate.

Other logical operations require further qubits. For example, the {\sc and} operation is achieved by use of the 3-qubit ``controlled-controlled-{\sc not}'' gate, in which the third qubit experiences {\sc not} if and only if both the others are in the state $\ket{1}$. This gate is named a Toffoli gate, after Toffoli (1980) who showed that the classical version is universal for classical reversible computation. The effect on a state $\ket{a}\ket{b}\ket{0}$ is $a \rightarrow a, b \rightarrow b, 0 \rightarrow a \cdot b$. In other words if the third qubit is prepared in $\ket{0}$ then this gate computes the {\sc and} of the first two qubits. The use of three qubits is necessary in order to permit the whole operation to be unitary, and thus allowed in quantum mechanical evolution.

It is an amusing exercise to find the combinations of gates which perform elementary arithmetical operations such as binary addition and multiplication. Many basic constructions are given by Barenco {\em et al.} (1995b); further general design considerations are discussed by Vedral {\em et al.} (1996) and Beckman {\em et al.} (1996).
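
As one small illustration of such an exercise (a sketch chosen here, not a construction taken from the papers just cited), a half adder can be built from one Toffoli gate followed by one {\sc xor} gate. Here $T_{1,2,3}$ is a label assumed for the Toffoli gate with control qubits 1 and 2 and target qubit 3, and $\mbox{\sc xor}_{1,2}$ denotes the {\sc xor} gate with control qubit 1 and target qubit 2:
\[
\ket{a}\ket{b}\ket{0}
\;\stackrel{T_{1,2,3}}{\longrightarrow}\;
\ket{a}\ket{b}\ket{a \cdot b}
\;\stackrel{\mbox{\sc xor}_{1,2}}{\longrightarrow}\;
\ket{a}\ket{a \oplus b}\ket{a \cdot b}.
\]
The second qubit then holds the sum bit $a \oplus b$ and the third the carry bit $a \cdot b$. On the four input states the network gives
\begin{eqnarray*}
\ket{000} &\rightarrow& \ket{000} \\
\ket{010} &\rightarrow& \ket{010} \\
\ket{100} &\rightarrow& \ket{110} \\
\ket{110} &\rightarrow& \ket{101}
\end{eqnarray*}
The Toffoli gate must act first, since the {\sc xor} gate overwrites $b$.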
The action of a sequence of quantum gates can be written in operator notation, for example $X_1 H_2 \mbox{\sc xor}_{1,3} \ket{\phi}$ where $\ket{\phi}$ is some state of three qubits, and the subscripts on the operators indicate to which qubits they apply. However, once more than a few quantum gates are involved, this notation is rather obscure, and can usefully be replaced by a diagram known as a quantum network (see fig. 8). These diagrams will be used hereafter.

\subsection{No cloning}

{\em No cloning theorem:} An unknown quantum state cannot be cloned.

This states that it is impossible to generate copies of a quantum state reliably, unless the state is already known (i.e. unless there exists classical information which specifies it). Proof: to generate a copy of a quantum state $\ket{\alpha}$, we must cause a pair of quantum systems to undergo the evolution $U (\ket{\alpha} \ket{0}) = \ket{\alpha} \ket{\alpha}$ where $U$ is the unitary evolution operator. If this is to work for any state, then $U$ must not depend on $\alpha$, and therefore $U (\ket{\beta} \ket{0}) = \ket{\beta} \ket{\beta}$ for $\ket{\beta} \ne \ket{\alpha}$. However, if we consider the state $\ket{\gamma} = (\ket{\alpha} + \ket{\beta})/\sqrt{2}$, we have $U (\ket{\gamma} \ket{0}) = (\ket{\alpha}\ket{\alpha} + \ket{\beta}\ket{\beta})/\sqrt{2} \ne \ket{\gamma}\ket{\gamma}$ so the cloning operation fails. This argument applies to any purported cloning method (Wootters and Zurek 1982, Dieks 1982).

Note that any given `cloning' operation $U$ can work on some states ($\ket{\alpha}$ and $\ket{\beta}$ in the above example), though since $U$ is unitary it preserves inner products, so that $\left< \alpha \right| \left. \beta \right> = \left( \left< \alpha \right| \left. \beta \right> \right)^2$ and two different clonable states must be orthogonal, $\left< \alpha \right| \left. \beta \right> = 0$. Unless we already know that the state to be copied is one of these states, we cannot guarantee that the chosen $U$ will correctly clone it.
This is in contrast to classical information, where machines like photocopiers can easily copy whatever classical information is sent to them. The controlled-{\sc not} or {\sc xor} operation of equation (\ref{cnot}) is a copying operation for the states $\ket{0}$ and $\ket{1}$, but not for states such as $\ket{+} \equiv (\ket{0} + \ket{1}) / \sqrt{2}$ and $\ket{-} \equiv (\ket{0} - \ket{1}) / \sqrt{2}$.

The no-cloning theorem and the EPR paradox together reveal a rather subtle way in which non-relativistic quantum mechanics is a consistent theory. For, if cloning were possible, then EPR correlations could be used to communicate faster than light, which leads to a contradiction (an effect preceding a cause) once the principles of special relativity are taken into account. To see this, observe that by generating many clones, and then measuring them in different bases, Bob could deduce unambiguously whether his member of an EPR pair is in a state of the basis $\{\ket{0}, \ket{1}\}$ or of the basis $\{\ket{+},\ket{-}\}$. Alice would communicate instantaneously by forcing the EPR pair into one basis or the other through her choice of measurement axis (Glauber 1986).

\subsection{Dense coding}

We will discuss the following statement:

{\em Quantum entanglement is an information resource.}

Qubits can be used to store and transmit classical information. To transmit a classical bit string 00101, for example, Alice can send 5 qubits prepared in the state $\ket{00101}$. The receiver Bob can extract the information by measuring each qubit in the basis $\{ \ket{0}, \ket{1} \}$ (i.e. these are the eigenstates of the measured observable). The measurement results yield the classical bit string with no ambiguity. No more than one classical bit can be communicated for each qubit sent.
Suppose now that Alice and Bob are in possession of an entangled pair of qubits, in the state $\ket{00} + \ket{11}$ (we will usually drop normalisation factors such as $\sqrt{2}$ from now on, to keep the notation uncluttered). Alice and Bob need never have communicated: we imagine a mechanical central facility generating entangled pairs and sending one qubit to each of Alice and Bob, who store them (see fig. 9a). In this situation, Alice can communicate {\em two} classical bits by sending Bob only {\em one} qubit (namely her half of the entangled pair). This idea, due to Wiesner (Bennett and Wiesner 1992), is called ``dense coding'', since only one quantum bit travels from Alice to Bob in order to convey two classical bits. Two quantum bits are involved, but Alice only ever sees one of them.

The method relies on the following fact: the four mutually orthogonal states $\ket{00} + \ket{11},\; \ket{00} - \ket{11}$, $\ket{01} + \ket{10},\;\ket{01} - \ket{10}$ can be generated from each other by operations on a single qubit. This set of states is called the Bell basis, since they exhibit the strongest possible Bell-EPR correlations (Braunstein {\em et al.} 1992). Starting from $\ket{00} + \ket{11}$, Alice can generate any of the Bell basis states by operating on her qubit with one of the operators $\{I,X,Y,Z\}$. Since there are four possibilities, her choice of operation represents two bits of classical information. She then sends her qubit to Bob, who must deduce which Bell basis state the qubits are in. This he does by operating on the pair with the {\sc xor} gate, and measuring the target bit, thus distinguishing $\ket{00} \pm \ket{11}$ from $\ket{01} \pm \ket{10}$. To find the sign in the superposition, he operates with $H$ on the remaining qubit, and measures it. Hence Bob obtains two classical bits with no ambiguity.

Dense coding is difficult to implement, and so has no practical value merely as a standard communication method.
However, it can permit secure communication: the qubit sent by Alice will only yield the two classical information bits to someone in possession of the entangled partner qubit. More generally, dense coding is an example of the statement which began this section. It reveals a relationship between classical information, qubits, and the information content of quantum entanglement (Barenco and Ekert 1995). A laboratory demonstration of the main features is described by Mattle {\em et al.} (1996); Weinfurter (1994) and Braunstein and Mann (1995) discuss some of the methods employed, based on a source of EPR photon pairs from parametric down-conversion.

\subsection{Quantum teleportation}

{\em It is possible to transmit qubits without sending qubits!}

Suppose Alice wishes to communicate to Bob a single qubit in the state $\ket{\phi}$. If Alice already knows what state she has, for example $\ket{\phi} = \ket{0}$, she can communicate it to Bob by sending just classical information, e.g. ``Dear Bob, I have the state $\ket{0}$. Regards, Alice.'' However, if $\ket{\phi}$ is unknown there is no way for Alice to learn it with certainty: any measurement she may perform may change the state, and she cannot clone it and measure the copies. Hence it appears that the only way to transmit $\ket{\phi}$ to Bob is to send him the physical qubit (i.e. the electron or atom or whatever), or possibly to swap the state into another quantum system and send that. In either case a quantum system is transmitted.

Quantum teleportation (Bennett {\em et al.} 1993, Bennett 1995) permits a way around this limitation. As in dense coding, we will use quantum entanglement as an information resource. Suppose Alice and Bob possess an entangled pair in the state $\ket{00} + \ket{11}$. Alice wishes to transmit to Bob a qubit in an unknown state $\ket{\phi}$. Without loss of generality, we can write $\ket{\phi} = a \ket{0} + b \ket{1}$ where $a$ and $b$ are unknown coefficients.
Then the initial state of all three qubits is \beq a\ket{000} + b\ket{100} + a\ket{011} + b\ket{111} \eeq Alice now measures in the Bell basis the first two qubits, i.e. the unknown one and her member of the entangled pair. The network to do this is shown in fig. 9b. After Alice has applied the {\sc xor} and Hadamard gates, and just before she measures her qubits, the state is \begin{eqnarray} \lefteqn{}&& \ket{00}\left( a\ket{0} + b \ket{1}\right) + \ket{01}\left( a\ket{1} + b \ket{0}\right) \nonumber \\ && \rule{-2ex}{0em} + \ket{10}\left( a\ket{0} - b \ket{1}\right) + \ket{11}\left( a\ket{1} - b \ket{0}\right). \end{eqnarray} Alice's measurements collapse the state onto one of four different possibilities, and yield two classical bits. The two bits are sent to Bob, who uses them to learn which of the operators $\{I,X,Z,Y\}$ he must apply to his qubit in order to place it in the state $a\ket{0} + b \ket{1} = \ket{\phi}$. Thus Bob ends up with the qubit (i.e. the quantum information, not the actual quantum system) which Alice wished to transmit. Note that the quantum information can only arrive at Bob if it disappears from Alice (no cloning). Also, quantum information is complete information: $\ket{\phi}$ is the complete description of Alice's qubit. The use of the word `teleportation' draws attention to these two facts. Teleportation becomes an especially important idea when we come to consider communication in the presence of noise, section \ref{s:qec}. \subsection{Quantum data compression} \lab{s:qdc} Having introduced the qubit, we now wish to show that it is a useful measure of quantum information content. The proof of this is due to Jozsa and Schumacher (1994) and Schumacher (1995), building on work of Kholevo (1973) and Levitin (1987). To begin the argument, we first need a quantity which expresses how much information you would gain if you were to learn the quantum state of some system $\cal Q$. 
A suitable quantity is the von Neumann entropy
\beq
S(\rho) = - {\rm Tr} \rho \log \rho
\eeq
where Tr is the trace operation, and $\rho$ is the density operator describing an ensemble of states of the quantum system. This is to be compared with the classical Shannon entropy, equation (\ref{S}). Suppose a classical random variable $X$ has a probability distribution $p(x)$. If a quantum system is prepared in a state $\ket{x}$ dictated by the value of $X$, then the density matrix is $\sum_x p(x) \ket{x} \bra{x}$, where the states $\ket{x}$ need not be orthogonal. It can be shown (Kholevo 1973, Levitin 1987) that $S(\rho)$ is an upper limit on the classical mutual information $I(X:Y)$ between $X$ and the result $Y$ of a measurement on the system.

To make connection with qubits, we consider the resources needed to store or transmit the state of a quantum system $q$ of density matrix $\rho$. The idea is to collect $n \gg 1$ such systems, and transfer (`encode') the joint state into some smaller system. The smaller system is transmitted down the channel, and at the receiving end the joint state is `decoded' into $n$ systems $q'$ of the same type as $q$ (see fig. 9c). The final density matrix of each $q'$ is $\rho'$, and the whole process is deemed successful if $\rho'$ is sufficiently close to $\rho$. The measure of the similarity between two density matrices is the {\em fidelity} defined by
\beq
f(\rho, \rho') = \left( {\rm Tr} \sqrt{\rho^{1/2} \rho' \rho^{1/2} } \right)^2
\eeq
This can be interpreted as the probability that $q'$ passes a test which ascertained if it was in the state $\rho$. When $\rho$ and $\rho'$ are both pure states, $\ket{\phi}\bra{\phi}$ and $\ket{\phi'}\bra{\phi'}$, the fidelity is none other than the familiar overlap: $f = | \left< \phi \right| \left. \phi' \right> |^2$.

Our aim is to find the smallest transmitted system which permits $f = 1 - \epsilon$ for $\epsilon \ll 1$.
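
A small worked example may make the entropy defined above more concrete. It is only an illustration: the choice of states, the equal probabilities, and the use of logarithms to base 2 are assumptions made here. Suppose each system is prepared with probability $1/2$ in $\ket{0}$ and with probability $1/2$ in $(\ket{0} + \ket{1})/\sqrt{2}$, two states which are not orthogonal. In the basis $\{ \ket{0}, \ket{1} \}$ the density matrix is
\[
\rho = \frac{1}{2} \left( \begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array} \right)
 + \frac{1}{4} \left( \begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array} \right)
 = \frac{1}{4} \left( \begin{array}{cc} 3 & 1 \\ 1 & 1 \end{array} \right),
\]
which has unit trace and determinant $1/8$, so its eigenvalues are
\[
\lambda_{\pm} = \frac{1}{2} \left( 1 \pm \frac{1}{\sqrt{2}} \right) \simeq 0.854, \; 0.146 .
\]
The von Neumann entropy is then
\[
S(\rho) = - \lambda_+ \log_2 \lambda_+ - \lambda_- \log_2 \lambda_- \simeq 0.195 + 0.406 \simeq 0.60 .
\]
This is less than the value 1 which would be obtained if the two states were orthogonal, even though the classical variable selecting the state carries one full bit of Shannon entropy.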
The argument is analogous to the `typical sequences' idea used in section \ref{s:dc}. Restricting ourselves for simplicity to two-state systems, the total state of $n$ systems is represented by a vector in a Hilbert space of $2^n$ dimensions. However, if the von Neumann entropy $S(\rho) < 1$ then it is highly likely (i.e. tends to certainty in the limit of large $n$) that, in any given realisation, the state vector actually falls in a {\em typical sub-space} of Hilbert space. Schumacher and Jozsa showed that the dimension of the typical sub-space is $2^{n S(\rho)}$. Hence only $n S(\rho)$ qubits are required to represent the quantum information faithfully, and the qubit (i.e. the logarithm of the dimensionality of Hilbert space) is a useful measure of quantum information. Furthermore, the encoding and decoding operation is `blind': it does not depend on knowledge of the exact states being transmitted.

Schumacher and Jozsa's result is powerful because it is general: no assumptions are made about the exact nature of the quantum states involved. In particular, they need not be orthogonal. If the states to be transmitted were mutually orthogonal, the whole problem would reduce to one of classical information.

The `encoding' and `decoding' required to achieve such quantum data compression and decompression is technologically very demanding. It cannot at present be done at all using photons. However, it is the ultimate compression allowed by the laws of physics. The details of the required quantum networks have been deduced by Cleve and DiVincenzo (1996).

As well as the essential concept of information, other classical ideas such as Huffman coding have their quantum counterparts. Furthermore, Schumacher and Nielsen (1996) derive a quantity which they call `coherent information' which is a measure of mutual information for quantum systems. It includes that part of the mutual information between entangled systems which cannot be accounted for classically.
This is a helpful way to understand the Bell-EPR correlations. \subsection{Quantum cryptography} No overview of quantum information is complete without a mention of quantum cryptography. This area stems from an unpublished paper of Wiesner written around 1970 (Wiesner 1983). It includes various ideas whereby the properties of quantum systems are used to achieve useful cryptographic tasks, such as secure (i.e. secret) communication. The subject may be divided into quantum {\em key distribution}, and a collection of other ideas broadly related to {\em bit commitment}. Quantum key distribution will be outlined below. Bit commitment refers to the scenario in which Alice must make some decision, such as a vote, in such a way that Bob can be sure that Alice fixed her vote before a given time, but where Bob can only learn Alice's vote at some later time which she chooses. A classical, cumbersome method to achieve bit commitment is for Alice to write down her vote and place it in a safe which she gives to Bob. When she wishes Bob, later, to learn the information, she gives him the key to the safe. A typical quantum protocol is a carefully constructed variation on the idea that Alice provides Bob with a prepared qubit, and only later tells him in what basis it was prepared. The early contributions to the field of quantum cryptography were listed in the introduction, further references may be found in the reviews mentioned at the beginning of this section. Cryptography has the unusual feature that it is not possible to prove by experiment that a cryptographic procedure is secure: who knows whether a spy or cheating person managed to beat the system? Instead, the users' confidence in the methods must rely on mathematical proofs of security, and it is here that much important work has been done. A concerted effort has enabled proofs to be established for the security of correctly implemented quantum key distribution. 
However, the bit commitment idea, long thought to be secure through quantum methods, was recently proved to be insecure (Mayers 1997, Lo and Chau 1997) because the participants can cheat by making use of quantum entanglement. Quantum key distribution is a method in which quantum states are used to establish a random secret key for cryptography. The essential ideas are as follows: Alice and Bob are, as usual, widely separated and wish to communicate. Alice sends to Bob $2n$ qubits, each prepared in one of the states $\ket{0},\ket{1},\ket{+},\ket{-}$, randomly chosen\footnote{Many other methods are possible; we adopt this one merely to illustrate the concepts.}. Bob measures his received qubits, choosing the measurement basis randomly between $\{\ket{0},\ket{1}\}$ and $\{\ket{+},\ket{-}\}$. Next, Alice and Bob inform each other publicly (i.e. anyone can listen in) of the basis they used to prepare or measure each qubit. They find out on which occasions they by chance used the same basis, which happens on average half the time, and retain just those results. In the absence of errors or interference, they now share the same random string of $n$ classical bits (they agree for example to associate $\ket{0}$ and $\ket{+}$ with 0; $\ket{1}$ and $\ket{-}$ with 1). This classical bit string is often called the {\em raw quantum transmission}, RQT. So far nothing has been gained by using qubits. The important feature is, however, that it is impossible for anyone to learn Bob's measurement results by observing the qubits {\em en route}, without leaving evidence of their presence. The crudest way for an eavesdropper Eve to attempt to discover the key would be for her to intercept the qubits and measure them, then pass them on to Bob. On average half the time Eve guesses Alice's basis correctly and thus does not disturb the qubit.
However, Eve's correct guesses do not coincide with Bob's, so Eve learns the state of half of the $n$ qubits which Alice and Bob later decide to trust, and disturbs the other half, for example sending to Bob $\ket{+}$ for Alice's $\ket{0}$. Half of those disturbed will be projected by Bob's measurement back onto the original state sent by Alice, so overall Eve corrupts $n/4$ bits of the RQT. Alice and Bob can now detect Eve's presence simply by randomly choosing $n/2$ bits of the RQT and announcing publicly the values they have. If they agree on all these bits, then they can trust that no eavesdropper was present, since the probability that Eve was present and they happened to choose $n/2$ uncorrupted bits is $(3/4)^{n/2} \simeq 10^{-62}$ for $n=1000$. The $n/2$ undisclosed bits form the secret key. In practice the protocol is more complicated since Eve might adopt other strategies (e.g. not intercept all the qubits), and noise will corrupt some of the qubits even in the absence of an eavesdropper. Instead of rejecting the key if many of the disclosed bits differ, Alice and Bob retain it as long as they find the error rate to be well below $25\%$. They then process the key in two steps. The first is to detect and remove errors, which is done by publicly comparing parity checks on publicly chosen random subsets of the bits, while discarding bits to prevent increasing Eve's information. The second step is to decrease Eve's knowledge of the key, by distilling from it a smaller key, composed of parity values calculated from the original key. In this way a key of around $n/4$ bits is obtained, of which Eve probably knows less than $10^{-6}$ of one bit (Bennett {\em et al.} 1992). The protocol just described is not the only one possible. Another approach (Ekert 1991) involves the use of EPR pairs, which Alice and Bob measure along one of three different axes. To rule out eavesdropping they check for Bell-EPR correlations in their results.
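The counting argument for the four-state protocol described above can be checked with a short Monte Carlo sketch. The Python fragment below is only an illustration under simplifying assumptions that are not part of the protocol itself: Eve intercepts every qubit and resends it in a randomly chosen basis, there is no channel noise, and a measurement in the `wrong' basis is modelled as a fair coin toss. The function names \texttt{bb84\_run} and \texttt{undetected\_probability} are hypothetical, introduced here for illustration only.

```python
import random

def bb84_run(num_qubits, eve_present, rng):
    """Simulate one run; return (number of retained bits, error fraction)."""
    retained = 0
    errors = 0
    for _ in range(num_qubits):
        alice_bit = rng.randrange(2)
        alice_basis = rng.randrange(2)        # 0 = {|0>,|1>}, 1 = {|+>,|->}
        bit, basis = alice_bit, alice_basis   # the qubit in flight
        if eve_present:
            eve_basis = rng.randrange(2)
            if eve_basis != basis:
                bit = rng.randrange(2)        # wrong basis: Eve's result is random
            basis = eve_basis                 # Eve resends what she measured
        bob_basis = rng.randrange(2)
        # Bob recovers the bit only if his basis matches the qubit's current basis.
        bob_bit = bit if bob_basis == basis else rng.randrange(2)
        if bob_basis == alice_basis:          # kept after the public comparison
            retained += 1
            errors += int(bob_bit != alice_bit)
    fraction = errors / retained if retained else 0.0
    return retained, fraction

def undetected_probability(n):
    """Chance that n/2 disclosed bits all agree if each is corrupted with probability 1/4."""
    return 0.75 ** (n // 2)

rng = random.Random(1)                        # fixed seed, for repeatability only
print(bb84_run(4000, False, rng))             # no eavesdropper
print(bb84_run(4000, True, rng))              # intercept-resend eavesdropper
print(undetected_probability(1000))
```

In this model roughly half of the transmitted qubits are retained, the error fraction among them is zero without Eve, and it fluctuates around $1/4$ when she intercepts every qubit, which is the origin of the $25\%$ threshold quoted above.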
The great thing about quantum key distribution is that it is feasible with current technology. A pioneering experiment (Bennett and Brassard 1989) demonstrated the principle, and much progress has been made since then. Hughes {\em et al.} (1995) and Phoenix and Townsend (1995) summarised the state of affairs two years ago, and recently Zbinden {\em et al.} (1997) have reported excellent key distribution through 23 km of standard telecom fibre under Lake Geneva. The qubits are stored in the polarisation states of laser pulses, i.e. coherent states of light, with on average $0.1$ photons per pulse. This low light level is necessary so that pulses containing more than one photon are unlikely. Such pulses would provide duplicate qubits, and hence a means for an eavesdropper to go undetected. The system achieves a bit error rate of $1.35\%$, which is low enough to guarantee privacy in the full protocol. The data transmission rate is rather low: MHz as opposed to the GHz rates common in classical communications, but the system is very reliable. Such spectacular experimental mastery is in contrast to the subject of the next section.

\section{The universal quantum computer} \lab{s:uqc}

We now have sufficient concepts to understand the jewel at the heart of quantum information theory, namely, the quantum computer (QC). Ekert and Jozsa (1996) and Barenco (1996) give introductory reviews concentrating on the quantum computer and factorisation; a review with emphasis on practicalities is provided by Spiller (1996). Introductory material is also provided by DiVincenzo (1995b) and Shor (1996). The QC is first and foremost a machine which is a theoretical construct, like a thought-experiment, whose purpose is to allow quantum information processing to be formally analysed. In particular it establishes the Church-Turing Principle introduced in section \ref{s:qvc}.
Here is a prescription for a quantum computer, based on that of Deutsch (1985, 1989): A quantum computer is a set of $n$ qubits in which the following operations are experimentally feasible: \begin{enumerate} \item Each qubit can be prepared in some known state $\ket{0}$. \item Each qubit can be measured in the basis $\{ \ket{0}, \ket{1} \}$. \item A universal quantum gate (or set of gates) can be applied at will to any fixed-size subset of the qubits. \item The qubits do not evolve other than via the above transformations. \end{enumerate} This prescription is incomplete in certain technical ways to be discussed, but it encompasses the main ideas. The model of computation we have in mind is a network model, in which logic gates are applied sequentially to a set of bits (here, quantum bits). In an electronic classical computer, logic gates are spread out in space on a circuit board, but in the QC we typically imagine the logic gates to be interactions turned on and off in time, with the qubits at fixed positions, as in a quantum network diagram (fig. 8, 12). Other models of quantum computation can be conceived, such as a cellular automaton model (Margolus 1990). \subsection{Universal gate} The universal quantum gate is the quantum equivalent of the classical universal gate, namely a gate which by its repeated use on different combinations of bits can generate the action of any other gate. What is the set of all possible quantum gates, however? To answer this, we appeal to the principles of quantum mechanics (Schr\"odinger's equation), and answer that since all quantum evolution is unitary, it is sufficient to be able to generate {\em all unitary transformations} of the $n$ qubits in the computer. This might seem a tall order, since we have a continuous and therefore infinite set. However, it turns out that quite simple quantum gates can be universal, as Deutsch showed in 1985. 
The simplest way to think about universal gates is to consider the pair of gates $V(\theta, \phi)$ and controlled-not (or {\sc xor}), where $V(\theta, \phi)$ is a general rotation of a single qubit, i.e.
\beq
V(\theta, \phi) = \left( \begin{array}{lr} \cos (\theta/2) & -i e^{-i\phi} \sin (\theta/2) \\ -i e^{i\phi} \sin (\theta/2) & \cos (\theta/2) \end{array} \right). \lab{V}
\eeq
It can be shown that any $n \times n$ unitary matrix can be formed by composing 2-qubit {\sc xor} gates and single-qubit rotations. Therefore, this pair of operations is universal for quantum computation. A purist may argue that $V(\theta, \phi)$ is an infinite set of gates since the parameters $\theta$ and $\phi$ are continuous, but it suffices to choose two particular irrational angles for $\theta$ and $\phi$, and the resulting single gate can generate all single-qubit rotations by repeated application; however, a practical system need not use such laborious methods. The {\sc xor} and rotation operations can be combined to make a controlled rotation which is a single universal gate. Such universal quantum gates were discussed by Deutsch {\em et al.} (1995), Lloyd (1995), DiVincenzo (1995a) and Barenco (1995). It is remarkable that 2-qubit gates are sufficient for quantum computation. This is why the quantum gate is a powerful and important concept.

\subsection{Church-Turing principle}

Having presented the QC, it is necessary to argue for its universality, i.e. that it fulfills the Church-Turing Principle as claimed. The two-step argument is very simple. First, the state of any finite quantum system is simply a vector in Hilbert space, and therefore can be represented to arbitrary precision by a finite number of qubits. Secondly, the evolution of any finite quantum system is a unitary transformation of the state, and therefore can be simulated on the QC, which can generate any unitary transformation with arbitrary precision.
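As a small concrete illustration of the two building blocks on which this argument rests, the following Python sketch writes the rotation $V(\theta,\phi)$ of eq. (\ref{V}) and the {\sc xor} gate as explicit matrices, checks numerically that both are unitary, and composes them to turn $\ket{00}$ into an entangled state. It is a sketch only, using plain nested lists rather than any particular library; the helper names (\texttt{matmul}, \texttt{dagger}, \texttt{kron}, \texttt{apply}) and the choice $\theta = \phi = \pi/2$ are assumptions made for illustration, with the first qubit taken as the control of the {\sc xor}.

```python
import cmath
import math

def V(theta, phi):
    """Single-qubit rotation V(theta, phi) as a 2x2 nested list."""
    c = math.cos(theta / 2)
    s = math.sin(theta / 2)
    return [[c, -1j * cmath.exp(-1j * phi) * s],
            [-1j * cmath.exp(1j * phi) * s, c]]

# Controlled-not (XOR) in the basis |00>, |01>, |10>, |11>; first qubit is the control.
XOR = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]

I2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate (transpose and complex-conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Tensor product; the first factor acts on the first (most significant) qubit."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def is_unitary(u, tol=1e-12):
    """Check U U^dagger = I to within a numerical tolerance."""
    p = matmul(u, dagger(u))
    n = len(u)
    return all(abs(p[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

def apply(u, state):
    """Apply the matrix u to a state vector given as a list of amplitudes."""
    return [sum(u[i][j] * state[j] for j in range(len(state))) for i in range(len(u))]

rot = V(math.pi / 2, math.pi / 2)             # maps |0> to (|0> + |1>)/sqrt(2)
state = apply(kron(rot, I2), [1, 0, 0, 0])    # rotate the first qubit of |00>
bell = apply(XOR, state)                      # (|00> + |11>)/sqrt(2)
print(is_unitary(rot), is_unitary(XOR))
print([round(abs(amp) ** 2, 6) for amp in bell])
```

One rotation followed by one {\sc xor} thus already produces a state which cannot be written as a product of single-qubit states.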
A point of principle is raised by Myers (1997), who points out that there is a difficulty with computational tasks for which the number of steps for completion cannot be predicted. We cannot in general observe the QC to find out if it has halted, in contrast to a classical computer. However, we will only be concerned with tasks where either the number of steps is predictable, or the QC can signal completion by setting a dedicated qubit which is otherwise not involved in the computation (Deutsch 1985). This is a very broad class of problems. Nielsen and Chuang (1997) consider the use of a {\em fixed} quantum gate array, showing that there is no array which, operating on qubits representing both data and program, can perform any unitary transformation on the data. However, we consider a machine in which a classical computer controls the quantum gates applied to a quantum register, so any gate array can be `ordered' by a classical program to the classical computer. The QC is certainly an interesting theoretical tool. However, there hangs over it a large and important question-mark: what about imperfection? The prescription given above is written as if measurements and gates can be applied with arbitrary precision, which is unphysical, as is the fourth requirement (no extraneous evolution). The prescription can be made realistic by attaching to each of the four requirements a statement about the degree of allowable imprecision. This is a subject of on-going research, and we will take it up in section \ref{s:qec}. Meanwhile, let us investigate more specifically what a sufficiently well-made quantum computer might do. \section{Quantum algorithms} \lab{s:qa} It is well known that classical computers are able to calculate the behaviour of quantum systems, so we have not yet demonstrated that a quantum computer can do anything which a classical computer can not. 
Indeed, since our theories of physics always involve equations which we can write down and manipulate, it seems highly unlikely that quantum mechanics, or any future physical theory, would permit computational problems to be addressed which are not in principle solvable on a large enough classical Turing machine. However, as we saw in section \ref{s:cc}, those words `large enough', and also `fast enough', are centrally important in computer science. Problems which are computationally `hard' can be impossible in practice. In technical language, while quantum computing does not enlarge the set of computational problems which can be addressed (compared to classical computing), it does introduce the possibility of new complexity classes. Put more simply, tasks for which classical computers are too slow may be solvable with quantum computers. \subsection{Simulation of physical systems} The first and most obvious application of a QC is that of simulating some other quantum system. To simulate a state vector in a $2^n$-dimensional Hilbert space, a classical computer needs to manipulate vectors containing of order $2^n$ complex numbers, whereas a quantum computer requires just $n$ qubits, making it much more efficient in storage space. To simulate evolution, in general both the classical and quantum computers will be inefficient. A classical computer must manipulate matrices containing of order $2^{2n}$ elements, which requires a number of operations (multiplication, addition) exponentially large in $n$, while a quantum computer must build unitary operations in $2^n$-dimensional Hilbert space, which usually requires an exponentially large number of elementary quantum logic gates. Therefore the quantum computer is not guaranteed to simulate {\em every} physical system efficiently. 
However, it can be shown that it can simulate a large class of quantum systems efficiently, including many for which there is no efficient classical algorithm, such as many-body systems with local interactions (Lloyd 1996, Zalka 1996, Wiesner 1996, Meyer 1996, Lidar and Biham 1996, Abrams and Lloyd 1997, Boghosian and Taylor 1997).

\subsection{Period finding and Shor's factorisation algorithm}

So far we have discussed simulation of Nature, which is a rather restricted type of computation. We would like to let the QC loose on more general problems, but it has so far proved hard to find ones on which it performs better than classical computers. However, the fact that there exist such problems at all is a profound insight into physics, and has stimulated much of the recent interest in the field. Currently one of the most important quantum algorithms is that for finding the period of a function. Suppose a function $f(x)$ is periodic with period $r$, i.e. $f(x) = f(x + r)$. Suppose further that $f(x)$ can be efficiently computed from $x$, and all we know initially is that $N/2 < r < N$ for some $N$. Assuming there is no analytic technique to deduce the period of $f(x)$, the best we can do on a classical computer is to calculate $f(x)$ for of order $N/2$ values of $x$, and find out when the function repeats itself (for well-behaved functions only $O(\sqrt{N})$ values may be needed on average). This is inefficient since the number of operations is exponential in the input size $\log N$ (the information required to specify $N$). The task can be solved efficiently on a QC by the elegant method shown in fig. 10, due to Shor (1994), building on Simon (1994). The QC requires $2n$ qubits, plus a further $O(n)$ for workspace, where $n = \lceil 2 \log N \rceil$ (the notation $\lceil x \rceil$ means the smallest integer greater than or equal to $x$). These are divided into two `registers', each of $n$ qubits.
They will be referred to as the $x$ and $y$ registers; both are initially prepared in the state $\ket{0}$ (i.e. all $n$ qubits in states $\ket{0}$). Next, the operation $H$ is applied to each qubit in the $x$ register, making the total state
\beq
\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} \ket{x} \ket{0} \label{step1}
\eeq
where $w = 2^n$. This operation is referred to as a Fourier transform in fig. 10, for reasons that will shortly become apparent. The notation $\ket{x}$ means a state such as $\ket{0011010}$, where $0011010$ is the integer $x$ in binary notation. In this context the basis $\{ \ket{0}, \ket{1} \}$ is referred to as the `computational basis.' It is convenient (though not of course necessary) to use this basis when describing the computer. Next, a network of logic gates is applied to both $x$ and $y$ registers, to perform the transformation $U_f \ket{x} \ket{0} = \ket{x} \ket{f(x)}$. Note that this transformation can be unitary because the input state $\ket{x} \ket{0}$ is in one to one correspondence with the output state $\ket{x} \ket{f(x)}$, so the process is reversible. Now, applying $U_f$ to the state given in eq. (\ref{step1}), we obtain
\beq
\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} \ket{x} \ket{f(x)} \label{step2}
\eeq
This state is illustrated in fig. 11a. At this point something rather wonderful has taken place: the value of $f(x)$ has been calculated for $w=2^n$ values of $x$, all in one go! This feature is referred to as {\em quantum parallelism} and represents a huge parallelism because of the exponential dependence on $n$ (imagine having $2^{100}$, i.e. a million times Avogadro's number, of classical processors!). Although the $2^n$ evaluations of $f(x)$ are in some sense `present' in the quantum state in eq. (\ref{step2}), unfortunately we cannot gain direct access to them.
For, a measurement (in the computational basis) of the $y$ register, which is the next step in the algorithm, will only reveal one value of $f(x)$\footnote{It is not strictly necessary to measure the $y$ register, but this simplifies the description.}. Suppose the value obtained is $f(x) = u$. The $y$ register state collapses onto $\ket{u}$, and the total state becomes
\beq
\frac{1}{\sqrt{M}} \sum_{j=0}^{M-1} \ket{d_u+jr} \ket{u} \label{step3}
\eeq
where $d_u + j r$, for $j=0,1,2 \ldots M-1$, are all the values of $x$ for which $f(x) = u$. In other words the periodicity of $f(x)$ means that the $x$ register remains in a superposition of $M \simeq w/r$ states, at values of $x$ separated by the period $r$. Note that the offset $d_u$ of the set of $x$ values depends on the value $u$ obtained in the measurement of the $y$ register. It now remains to extract the periodicity of the state in the $x$ register. This is done by applying a Fourier transform, and then measuring the state. The discrete Fourier transform employed is the following unitary process:
\beq
U_{\cal FT} \ket{x} = \frac{1}{\sqrt{w}} \sum_{k=0}^{w-1} e^{i 2\pi k x /w} \ket{k}
\eeq
Note that eq. (\ref{step1}) is an example of this, operating on the initial state $\ket{0}$. The quantum network to apply $U_{\cal FT}$ is based on the fast Fourier transform algorithm (see, e.g., Knuth (1981)). The quantum version was worked out by Coppersmith (1994) and Deutsch (1994) independently; a clear presentation may also be found in Ekert and Jozsa (1996), Barenco (1996)\footnote{An exact quantum Fourier transform would require rotation operations of precision exponential in $n$, which raises a problem with the efficiency of Shor's algorithm. However, an approximate version of the Fourier transform is sufficient (Barenco {\em et al.} 1996).}. Before applying $U_{\cal FT}$ to eq. (\ref{step3}) we will make the simplifying assumption that $r$ divides $w$ exactly, so $M = w/r$.
The essential ideas are not affected by this restriction; when it is relaxed some added complications must be taken into account (Shor 1994, 1995a; Ekert and Jozsa 1996). The $y$ register no longer concerns us, so we will just consider the $x$ state from eq. (\ref{step3}):
\beq
U_{\cal FT} \frac{1}{\sqrt{w/r}} \sum_{j=0}^{w/r-1} \ket{d_u + jr} = \frac{1}{\sqrt{r}} \sum_k \tilde{f} (k) \ket{k} \label{step4}
\eeq
where
\beq
|\tilde{f} (k)| = \left\{ \begin{array}{ll} 1 \;\;\; & \mbox{if $k$ is a multiple of $w/r$} \\ 0 & \mbox{otherwise} \end{array} \right.
\eeq
This state is illustrated in fig. 11b. The final state of the $x$ register is now measured, and we see that the value obtained must be a multiple of $w/r$. It remains to deduce $r$ from this. We have $x = \lambda w/r$ where $\lambda$ is unknown. If $\lambda$ and $r$ have no common factors, then we cancel $x/w$ down to an irreducible fraction and thus obtain $\lambda$ and $r$. If $\lambda$ and $r$ have a common factor, which is unlikely for large $r$, then the algorithm fails. In this case, the whole algorithm must be repeated from the start. After a number of repetitions no greater than $\sim \log r$, and usually much less than this, the probability of success can be shown to be arbitrarily close to $1$ (Ekert and Jozsa 1996). The quantum period-finding algorithm we have described is efficient as long as $U_f$, the evaluation of $f(x)$, is efficient. The total number of elementary logic gates required is a polynomial rather than exponential function of $n$. As was emphasised in section \ref{s:cc}, this makes all the difference between tractable and intractable in practice, for sufficiently large $n$. To add the icing on the cake, it can be remarked that the important factorisation problem mentioned in section \ref{s:cc} can be reduced to one of finding the period of a simple function.
This and all the above ingredients were first brought together by Shor (1994), who thus showed that the factorisation problem is tractable on an ideal quantum computer. The function to be evaluated in this case is $f(x) = a^x \;{\rm mod}\;N$ where $N$ is the number to be factorised, and $a < N$ is chosen randomly. One can show using elementary number theory (Ekert and Jozsa 1996) that for most choices of $a$, the period $r$ is even and $a^{r/2} \pm 1$ shares a common factor with $N$. The common factor (which is of course a factor of $N$) can then be deduced rapidly using a classical algorithm due to Euclid ({\em circa} 300 BC; see, e.g. Hardy and Wright 1965). To evaluate $f(x)$ efficiently, repeated squaring (modulo $N$) is used, giving powers $((a^2)^2)^2 \ldots$. Selected such powers of $a$, corresponding to the binary expansion of $x$, are then multiplied together. Complete networks for the whole of Shor's algorithm were described by Miquel {\em et al.} (1996), Vedral {\em et al.} (1996) and Beckman {\em et al.} (1996). They require of order $300 (\log N)^3$ logic gates. Therefore, to factorise numbers of order $10^{130}$, i.e. at the limit of current classical methods, would require $\sim 2 \times 10^{10}$ gates per run, or 7 hours if the `switching rate' is one megahertz\footnote{The algorithm might need to be run $\log r \sim 60$ times to ensure at least one successful run, but the average number of runs required will be much less than this.}. Considering how difficult it is to make a quantum computer, this offers no advantage over classical computation. However, if we double the number of digits to 260 then the problem is intractable classically (see section \ref{s:cc}), while the ideal quantum computer takes just 8 times longer than before. The existence of such a powerful method is an exciting and profound new insight into quantum theory.
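The steps of the period-finding and factorisation procedure can be followed on a toy case by direct classical simulation of the $x$ register. The Python sketch below is an illustration only, not one of the quantum networks cited above: it stores all $w = 2^n$ amplitudes explicitly, which is exactly what a quantum computer avoids. The example values $N = 15$, $a = 7$, $u = 4$ and all function names are assumptions chosen for illustration; they give $r = 4$, which divides $w = 256$, so the simplifying assumption $M = w/r$ holds. The gate-count helper assumes that the logarithm in $300 (\log N)^3$ is taken to base 2.

```python
import cmath
import math
from fractions import Fraction

def mod_exp(a, x, N):
    """Compute a**x mod N by repeated squaring over the binary expansion of x."""
    result = 1
    power = a % N
    while x > 0:
        if x & 1:                      # this binary digit of x is 1
            result = (result * power) % N
        power = (power * power) % N    # a, a^2, (a^2)^2, ...
        x >>= 1
    return result

def x_register_probabilities(a, N, n, u):
    """Probability of each outcome k after the Fourier transform, given y = u."""
    w = 2 ** n
    xs = [x for x in range(w) if mod_exp(a, x, N) == u]   # the values d_u + j r
    norm = 1 / math.sqrt(len(xs) * w)
    probs = []
    for k in range(w):
        amp = norm * sum(cmath.exp(2j * math.pi * k * x / w) for x in xs)
        probs.append(abs(amp) ** 2)
    return probs

def period_from_measurement(k, w):
    """Cancel k/w down to lowest terms lambda/r and return the denominator."""
    return Fraction(k, w).denominator

def factors_from_period(a, r, N):
    """Return (gcd(a^(r/2) - 1, N), gcd(a^(r/2) + 1, N)), or None if the choice of a fails."""
    if r % 2 != 0:
        return None
    half = mod_exp(a, r // 2, N)
    if half == N - 1:                  # a^(r/2) = -1 mod N gives only trivial factors
        return None
    return math.gcd(half - 1, N), math.gcd(half + 1, N)

def gate_count(decimal_digits):
    """Estimate 300 (log N)^3 for N of order 10**decimal_digits (log base 2 assumed)."""
    return 300 * (decimal_digits * math.log2(10)) ** 3

n = math.ceil(2 * math.log2(15))       # n = 8, so w = 256
probs = x_register_probabilities(7, 15, n, 4)
peaks = [k for k, p in enumerate(probs) if p > 1e-9]
print(peaks)                                            # multiples of w/r
print([period_from_measurement(k, 2 ** n) for k in peaks])
print(factors_from_period(7, 4, 15))
print(gate_count(130) / 1e6 / 3600)                     # hours at one megahertz
```

The outcomes $k = 0$ and $k = 128$ illustrate the failure case in which $\lambda$ and $r$ share a common factor, so that cancelling $k/w$ yields a divisor of $r$ rather than $r$ itself and the run must be repeated.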
The period-finding algorithm appears at first sight like a conjuring trick: it is not quite clear how the quantum computer managed to produce the period like a rabbit out of a hat. Examining fig. 11 and equations (\ref{step1}) to (\ref{step4}), I would say that the most important features are contained in eq. (\ref{step2}). They are not only the {\em quantum parallelism} already mentioned, but also {\em quantum entanglement}, and, finally, quantum interference. Each value of $f(x)$ retains a link with the value of $x$ which produced it, through the entanglement of the $x$ and $y$ registers in eq. (\ref{step2}). The `magic' happens when a measurement of the $y$ register produces the special state $\ket{\psi}$ (eq. \ref{step3}) in the $x$ register, and it is quantum entanglement which permits this (see also Jozsa 1997a). The final Fourier transform can be regarded as an interference between the various superposed states in the $x$ register (compare with the action of a diffraction grating). Interference effects can be used for computational purposes with classical light fields, or water waves for that matter, so interference is not in itself the essentially quantum feature. Rather, the exponentially large number of interfering states, and the entanglement, are features which do not arise in classical systems. \subsection{Grover's search algorithm} Despite considerable efforts in the quantum computing community, the number of useful quantum algorithms which have been discovered remains small. They consist mainly of variants on the period-finding algorithm presented above, and another quite different task: that of searching an unstructured list. Grover (1997) presented a quantum algorithm for the following problem: given an unstructured list of items $\{ x_i \}$, find a particular item $x_j = t$. Think, for example, of looking for a particular telephone number in the telephone directory (for someone whose name you do not know). 
It is not hard to prove that classical algorithms can do no better than searching through the list, requiring on average $N/2$ steps, for a list of $N$ items. Grover's algorithm requires of order $\sqrt{N}$ steps. The task remains computationally hard: it is not transferred to a new complexity class, but it is remarkable that such a seemingly hopeless task can be speeded up at all. The `quantum speed-up' $\sim \sqrt{N}/2$ is greater than that achieved by Shor's factorisation algorithm ($\sim \exp( 2 (\ln N)^{1/3} )$), and would be important for the huge sets ($N \simeq 10^{16}$) which can arise, for example, in code-breaking problems (Brassard 1997). An important further point was proved by Bennett {\em et al.} (1997), namely that Grover's algorithm is optimal: no quantum algorithm can do better than $O(\sqrt{N})$. A brief sketch of Grover's algorithm is as follows. Each item has a label $i$, and we must be able to test in a unitary way whether any item is the one we are seeking. In other words there must exist a unitary operator $S$ such that $S \ket{i} = \ket{i}$ if $i \ne j$, and $S \ket{j} = - \ket{j}$, where $j$ is the label of the special item. For example, the test might establish whether $i$ is the solution of some hard computational problem\footnote{That is, an ``{\sc np}'' problem for which finding a solution is hard, but testing a proposed solution is easy.}. The method begins by placing a single quantum register in a superposition of all computational states, as in the period-finding algorithm (eq. (\ref{step1})). Define
\beq
\ket{ \Psi (\theta ) } \equiv \sin \theta \ket{j} + \frac{ \cos \theta }{\sqrt{N-1}} \sum_{i \ne j} \ket{i}
\eeq
where $j$ is the label of the element $t = x_j$ to be found. The initially prepared state is an equally-weighted superposition, $\ket{ \Psi (\theta_0 ) }$ where $\sin \theta_0 = 1 / \sqrt{N}$.
Now apply $S$, which reverses the sign of the one special element of the superposition, then Fourier transform, change the sign of all components except $\ket{0}$, and Fourier transform back again. These operations represent a subtle interference effect which achieves the following transformation:
\beq
U_G \ket{ \Psi (\theta ) } = \ket{ \Psi (\theta + \phi ) } \label{UG}
\eeq
where $\sin \phi = 2 \sqrt{N-1} / N$. The coefficient of the special element is now slightly larger than that of all the other elements. The method proceeds simply by applying $U_G$ $m$ times, where $m \simeq (\pi/4) \sqrt{N}$. The slow rotation brings $\theta$ very close to $\pi/2$, so the quantum state becomes almost precisely equal to $\ket{j}$. After the $m$ iterations the state is measured and the value $j$ obtained (with error probability $O(1/N)$). If $U_G$ is applied too many times, the success probability diminishes, so it is important to know $m$, which was deduced by Boyer {\em et al.} (1996). Kristen Fuchs compares the technique to cooking a souffl\'e. The state is placed in the `quantum oven' and the desired answer rises slowly. You must open the oven at the right time, neither too soon nor too late, to guarantee success. Otherwise the souffl\'e will fall---the state collapses to the wrong answer. The two algorithms I have presented are the easiest to describe, and illustrate many of the methods of quantum computation. However, just what further methods may exist is an open question. Kitaev (1996) has shown how to solve the factorisation and related problems using a technique fundamentally different from Shor's. His ideas have some similarities to Grover's. Kitaev's method is helpfully clarified by Jozsa (1997b) who also brings out the common features of several quantum algorithms based on Fourier transforms. The quantum programmer's toolbox is thus slowly growing.
It seems safe to predict, however, that the class of problems for which quantum computers out-perform classical ones is a special and therefore small class. On the other hand, any problem for which finding solutions is hard, but testing a candidate solution is easy, can as a last resort be solved by an exhaustive search, and here Grover's algorithm may prove very useful.
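The rotation picture of Grover's algorithm given earlier in this subsection can be checked numerically on a small list. The Python sketch below is an illustration under stated assumptions: it tracks all $N$ amplitudes classically, and it implements the sequence `Fourier transform, change the sign of all components except $\ket{0}$, Fourier transform back' as an inversion of every amplitude about the mean, which is the equivalent form of that operation when the transform is taken to be a Hadamard operation on each qubit. The values $N = 64$, $j = 37$ and the function names are hypothetical choices for illustration.

```python
import math

def grover_iteration(amps, j):
    """One application of U_G: the sign flip S on item j, then inversion about the mean."""
    amps = list(amps)
    amps[j] = -amps[j]                      # the operator S
    mean = sum(amps) / len(amps)
    return [2 * mean - a for a in amps]     # transform, sign change, transform back

def success_probability(N, j, m):
    """Probability of measuring j after m iterations, starting from equal weights."""
    amps = [1 / math.sqrt(N)] * N           # the state Psi(theta_0)
    for _ in range(m):
        amps = grover_iteration(amps, j)
    return amps[j] ** 2

def predicted_probability(N, m):
    """sin^2(theta_0 + m phi), with sin(theta_0) = 1/sqrt(N), sin(phi) = 2 sqrt(N-1)/N."""
    theta0 = math.asin(1 / math.sqrt(N))
    phi = math.asin(2 * math.sqrt(N - 1) / N)
    return math.sin(theta0 + m * phi) ** 2

N, j = 64, 37
m = int(math.pi / 4 * math.sqrt(N))         # m = 6 for N = 64
for steps in (0, m // 2, m, 2 * m):
    print(steps, success_probability(N, j, steps), predicted_probability(N, steps))
```

The simulated and predicted probabilities agree at every step, the success probability is close to $1$ after $m$ iterations, and it falls back towards zero after $2m$ iterations, which is the falling souffl\'e of the analogy above.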