\section{Quantum versus classical physics} \label{s:qvc}

In order to think about quantum information theory, let us first
state the principles of non-relativistic quantum mechanics,
as follows (Shankar 1980).
\begin{enumerate}
\item The state of an isolated system $\cal Q$ is represented by a vector
$\ket{\psi(t)}$ in a Hilbert space.
\item Variables such as position and momentum are termed
observables and are represented by Hermitian operators.
The position and momentum operators $X,P$ have the following
matrix elements in the eigenbasis of $X$:
  \begin{eqnarray*}
    \bra{x} X \ket{x'} &=& x \delta (x-x') \\
    \bra{x} P \ket{x'} &=& -i \hbar \delta' (x-x')
  \end{eqnarray*}
\item The state vector obeys the Schr\"odinger equation
\beq
i \hbar \frac{d}{dt} \ket{\psi(t)} = {\cal H} \ket{\psi(t)}  \label{Sch}
\eeq
where ${\cal H}$ is the quantum Hamiltonian operator.
\item Measurement postulate.
\end{enumerate}

The fourth postulate, which has not been made explicit, is a subject of
some debate, since quite different interpretive approaches lead to the
same predictions, and the concept of `measurement' is fraught with
ambiguities in quantum mechanics (Wheeler and Zurek 1983, Bell 1987,
Peres 1993). A statement which is valid for most practical purposes is
that certain physical interactions are recognisably `measurements',
and their effect on the state vector $\ket{\psi}$ is to change
it to an eigenstate $\ket{k}$ of the variable being measured, the value
of $k$ being randomly chosen with probability
$P \propto |\left< k \right. \ket{\psi}|^2$. The
change $\ket{\psi} \rightarrow \ket{k}$ can be expressed by the
projection operator $(\ket{k}\bra{k})/\left< k \right. \ket{\psi}$.

Note that according to the above equations, the evolution of an isolated
quantum system is always {\em unitary}, in other words
$\ket{\psi(t)} = U(t) \ket{\psi(0)}$ where
$U(t) = \exp(-i \int {\cal H} dt / \hbar)$ is a unitary operator,
$U U^{\dagger} = I$.
This is true, but there is a difficulty: there is no such thing as a
truly isolated system (i.e. one which experiences no interactions with
any other systems), except possibly the whole universe. Therefore there
is always some approximation involved in using the Schr\"odinger
equation to describe real systems. One way to handle this approximation
is to speak of the system $\cal Q$ and its environment $\cal T$. The
evolution of $\cal Q$ is primarily that given by its Schr\"odinger
equation, but the interaction between $\cal Q$ and $\cal T$ has, in
part, the character of a measurement of $\cal Q$. This produces a
non-unitary contribution to the evolution of $\cal Q$
(since projections are not unitary), and this ubiquitous phenomenon
is called {\em decoherence}. I have underlined these elementary ideas
because they are central in what follows.

We can now begin to bring together ideas of physics and of
information processing. For, it is clear that much of the wonderful
behaviour we see around us in Nature could be understood as a form
of information processing, and conversely our computers are able
to simulate, by their processing, many of the patterns of Nature.
The obvious, if somewhat imprecise, questions are
\begin{enumerate}
\item ``can Nature
usefully be regarded as essentially an information processor?''
\item ``could a computer simulate the whole of Nature?''
\end{enumerate}

The principles of quantum mechanics suggest that the answer to the
first question is {\em yes}\footnote{This does not necessarily imply
that such language captures everything that can be said about Nature,
merely that this is a useful abstraction at the descriptive level of
physics. I do not believe any physical `laws' could be adequate to
completely describe human behaviour, for example, since they are
sufficiently approximate or non-prescriptive to leave us room for
manoeuvre (Polkinghorne 1994).}.
For, the state vector $\ket{\psi}$ so central to quantum mechanics is a
concept very much like those of information science: it is an abstract
entity which contains exactly all the information about the system
$\cal Q$. The word `exactly' here is a reminder that not only is
$\ket{\psi}$ a complete description of $\cal Q$, it is also one that
does not contain any extraneous information which cannot meaningfully
be associated with $\cal Q$. The importance of this in quantum
statistics of Fermi and Bose gases was mentioned in the introduction.

The second question can be made more precise by converting
the Church-Turing thesis into a principle of physics,

{\em Every finitely realizable physical system
can be simulated arbitrarily closely
by a universal model computing machine operating by
finite means.}

This statement is based on that of Deutsch (1985). The idea is to
propose that a principle like this is not derived from quantum
mechanics, but rather underpins it, like other principles such as that
of conservation of energy. The qualifications introduced by `finitely
realizable' and `finite means' are important in order to state
something useful.

The new version of the Church-Turing thesis (now called the
`Church-Turing Principle') does not refer to Turing machines. This is
important because there are fundamental differences between the very
nature of the Turing machine and the principles of quantum mechanics.
One is described in terms of operations on classical bits, the other in
terms of evolution of quantum states. Hence there is the possibility
that the universal Turing machine, and hence all classical computers,
might not be able to simulate some of the behaviour to be found in
Nature. Conversely, it may be physically possible (i.e. not ruled out
by the laws of Nature) to realise a new type of computation essentially
different from that of classical computer science. This is the central
aim of quantum computing.
\subsection{EPR paradox, Bell's inequality}  \lab{s:EPR}

In 1935 Einstein, Podolsky and Rosen (EPR) drew attention to an
important feature of non-relativistic quantum mechanics. Their
argument, and Bell's analysis, can now be recognised as one of the
seeds from which quantum information theory has grown. The EPR paradox
should be familiar to any physics graduate, and I will not repeat the
argument in detail. However, the main points will provide a useful way
in to quantum information concepts.

The EPR thought-experiment can be reduced in essence to an experiment
involving pairs of two-state quantum systems (Bohm 1951, Bohm and
Aharonov 1957). Let us consider a pair of spin-half particles $A$ and
$B$, writing the ($m_z = +1/2$) spin `up' state $\ket{\uparrow}$ and
the ($m_z = -1/2$) spin `down' state $\ket{\downarrow}$. The particles
are prepared initially in the singlet state
$(\ket{\uparrow}\ket{\downarrow} - \ket{\downarrow}\ket{\uparrow})/ \sqrt{2}$,
and they subsequently fly apart, propagating in opposite directions
along the $y$-axis. Alice and Bob are widely separated, and they
receive particles $A$ and $B$ respectively. EPR were concerned with
whether quantum mechanics provides a complete description of the
particles, or whether something was left out, some property of the spin
angular momenta ${\bf s}_A,{\bf s}_B$ which quantum theory failed to
describe. Such a property has since become known as a `hidden
variable'. They argued that something was left out, because this
experiment allows one to predict with certainty the result of measuring
any component of ${\bf s}_B$, without causing any disturbance of $B$.
Therefore all the
components of ${\bf s}_B$ have definite values, say EPR, and the
quantum theory only provides an incomplete description.
To make the certain prediction without disturbing $B$, one chooses any
axis $\eta$ along which one wishes to know $B$'s angular momentum, and
then measures not $B$ but $A$, using a Stern-Gerlach apparatus aligned
along $\eta$. Since the singlet state carries no net angular momentum,
one can be sure that the corresponding measurement on $B$ would yield
the opposite result to the one obtained for $A$.

The EPR paper is important because it is carefully argued, and the
fallacy is hard to unearth. The fallacy can be exposed in one
of two ways: one can say either that Alice's measurement does influence
Bob's particle, or (which I prefer) that the quantum state vector
$\ket{\phi}$ is not an intrinsic property of a quantum system, but
an expression for the information content of a quantum variable.
In a singlet state
there is mutual information between $A$ and $B$, so the information
content of $B$ changes when we learn something about $A$. So far
there is no difference from the behaviour of classical information,
so nothing surprising has occurred.

A more thorough analysis of the EPR experiment yields a big surprise.
This was discovered by Bell (1964, 1966). Suppose Alice and Bob measure
the spin component of $A$ and $B$ along different axes $\eta_A$ and
$\eta_B$ in the $x$-$z$ plane. Each measurement yields an answer $+$
or $-$. Quantum theory and experiment agree that the probability for
the two measurements to yield the same result is
$\sin^2((\phi_A - \phi_B)/2)$, where $\phi_A$ ($\phi_B$) is the angle
between $\eta_A$ ($\eta_B$) and the $z$ axis. However, there is no way
to assign {\em local} properties, that is properties of $A$ and $B$
independently, which lead to this high a correlation, in which the
results are
certain to be opposite when $\phi_A = \phi_B$, certain to be
equal when $\phi_A = \phi_B + 180^{\circ}$, and also, for example, have
a $\sin^2(60^{\circ}) = 3/4$ chance of being equal when
$\phi_A - \phi_B = 120^{\circ}$.
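
As a check on these numbers, here is a short worked calculation. It
assumes the standard singlet-state result (quoted, not derived, here)
that outcomes labelled $\pm 1$ have correlation
$E = -\cos(\phi_A - \phi_B)$; the symbols $E$, $P_{\rm same}$ and
$P_{\rm diff}$ are introduced only for this illustration. Since
$E = P_{\rm same} - P_{\rm diff}$ and $P_{\rm same} + P_{\rm diff} = 1$,
\begin{eqnarray*}
P_{\rm same} &=& \frac{1 + E}{2} \;=\; \frac{1 - \cos(\phi_A - \phi_B)}{2}
  \;=\; \sin^2\left(\frac{\phi_A - \phi_B}{2}\right), \\
P_{\rm same}(0^{\circ}) &=& 0, \qquad
P_{\rm same}(180^{\circ}) \;=\; 1, \qquad
P_{\rm same}(120^{\circ}) \;=\; \sin^2(60^{\circ}) \;=\; \frac{3}{4} .
\end{eqnarray*}
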
Feynman (1982) gives a particularly clear analysis. At
$\phi_A - \phi_B = 120^{\circ}$ the highest correlation which local
hidden variables
could produce is $2/3$.

The Bell-EPR argument allows us to identify a task which is physically
possible, but which no classical computer could perform: when
repeatedly given inputs $\phi_A$, $\phi_B$ at completely separated
locations, respond quickly (i.e. too quickly to allow light-speed
communication between the locations) with yes/no responses which are
perfectly correlated when $\phi_A = \phi_B + 180^{\circ}$,
anticorrelated when $\phi_A = \phi_B$,
and more than $\sim 70\%$ correlated when
$\phi_A - \phi_B = 120^{\circ}$.

Experimental tests of Bell's argument were carried out in the 1970s and
1980s and the quantum theory was verified (Clauser and Shimony 1978,
Aspect {\em et al.} 1982; for more recent work see Aspect 1991,
Kwiat {\em et al.} 1995 and references therein). This was a
significant new probe into the logical structure of quantum mechanics.
The argument can be made even stronger by considering a more
complicated system. In particular, for three spins prepared in a state
such as
$(\ket{\uparrow}\ket{\uparrow} \ket{\uparrow} + \ket{\downarrow}\ket{\downarrow}\ket{\downarrow}) / \sqrt{2}$,
Greenberger, Horne and Zeilinger (1989) (GHZ) showed that a single
measurement along a horizontal axis for two particles, and along a
vertical axis for the third, will yield with certainty a result which
is the exact opposite of what a local hidden-variable theory would
predict. A wider discussion and references are provided by Greenberger
{\em et al.} (1990) and Mermin (1990).

The Bell-EPR correlations show that quantum mechanics permits
at least one simple task which is beyond the capabilities of
classical computers, and they hint at a new type of mutual information
(Schumacher and Nielsen 1996).
In order to pursue these ideas, we will need to construct a complete
theory of quantum information.
\section{Quantum Information}Just as in the discussion of classical information theory, quantuminformation ideas are best introduced by stating them, andthen showing afterwards how they link together. Quantum communication is treated in a special issue of {\em J. Mod. Opt.},volume 41 (1994); reviews and references for quantumcryptography are given by Bennett {\em et. al.} (1992);Hughes {\em et. al.} (1995); Phoenix andTownsend (1995); Brassard and Crepeau (1996); Ekert (1997). Spiller (1996) reviews both communication and computing. \subsection{Qubits}The elementary unit of quantum information is the {\em qubit}(Schumacher 1995).A single qubit can be envisaged as a two-state system such as a spin-halfor a two-level atom (see fig. 12), but when we measure quantum information in qubits we are really doing something more abstract: a quantum system is said to have $n$ qubits if it has a Hilbert space of $2^n$ dimensions, and so has available $2^n$ {\em mutually orthogonal} quantum states (recall that $n$ classical bits can represent up to $2^n$ different things). This definition of the qubit will be elaborated in section \ref{s:qdc}. We will write two orthogonal states of a single qubit as $\{ \ket{0},\ket{1} \}$. More generally, $2^n$ mutually orthogonalstates of $n$ qubits can be written $\{ \ket{i} \}$, where $i$ isan $n$-bit binary number. For example, for three qubits wehave$\{ \ket{000},\ket{001}, \ket{010}, \ket{011},$ $\ket{100},\ket{101}, \ket{110}, \ket{111} \}$.\subsection{Quantum gates}Simple unitary operations on qubits are called quantum `logic gates'(Deutsch 1985, 1989).For example, if a qubit evolves as $\ket{0} \rightarrow \ket{0}$,$\ket{1} \rightarrow \exp(i\omega t)\ket{1}$, then after time$t$ we may say that the operation, or `gate'  \beqP(\theta) = \left( \begin{array}{cc}1 & 0 \\0 & e^{i \theta} \end{array} \right)  \eeqhas been applied to the qubit, where $\theta = \omega t$. 
This can also be written $P(\theta) = \ket{0}\bra{0}
+ \exp(i\theta) \ket{1}\bra{1}$. Here are some other
elementary quantum gates:
  \begin{eqnarray}
I &\equiv& \ket{0}\bra{0} + \ket{1}\bra{1} \;\; = \mbox{identity} \\
X &\equiv& \ket{0}\bra{1} + \ket{1}\bra{0} \;\; = \mbox{\sc not}\\
Z &\equiv& P(\pi) \\
Y &\equiv& X Z \\
H &\equiv& \frac{1}{\sqrt{2}}\left[ \rule{0em}{1.3em}
\left(\ket{0} + \ket{1}\right)\bra{0} + \left(\ket{0} - \ket{1}\right)\bra{1} \right]
  \end{eqnarray}
These all act on a single qubit, and can be achieved by the action
of some Hamiltonian in Schr\"odinger's equation, since they
are all unitary operators\footnote{The letter $H$ is adopted
for the final gate here because its effect is
a {\em Hadamard} transformation. This is not to be confused with the
Hamiltonian ${\cal H}$.}. There are an infinite number of single-qubit
quantum gates, in contrast to classical information theory,
where only two logic gates are possible for a single bit, namely
the identity and the logical {\sc not} operation. The quantum {\sc not}
gate carries $\ket{0}$ to $\ket{1}$ and vice versa, and so
is analogous to a classical {\sc not}. This gate is also called
$X$ since it is the Pauli $\sigma_x$ operator. Note that the set
$\{ I, X, Y, Z \}$ is a group under multiplication, up to overall
signs (for example $Z X = -Y$).

Of all the possible unitary operators acting on a pair of qubits,
an interesting subset is those which can be written
$\ket{0}\bra{0}\otimes I + \ket{1}\bra{1}\otimes U$, where $I$ is the
single-qubit identity operation, and $U$ is some other single-qubit
gate. Such a two-qubit gate is called a ``controlled $U$''
gate, since the action $I$ or $U$ on the second qubit is controlled
by whether the first qubit is in the state $\ket{0}$ or $\ket{1}$.
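
To make the definition concrete, the following sketch writes the
controlled-$U$ gate as a matrix. It assumes the basis ordering
$\{\ket{00},\ket{01},\ket{10},\ket{11}\}$ with the first qubit as
control; the symbols $u_{ij}$ for the matrix elements of $U$ are
introduced here only for illustration:
\begin{eqnarray*}
\ket{0}\bra{0}\otimes I + \ket{1}\bra{1}\otimes U &=&
\left( \begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & u_{00} & u_{01} \\
0 & 0 & u_{10} & u_{11} \end{array} \right) .
\end{eqnarray*}
With $U = P(\theta)$, for instance, this becomes
${\rm diag}(1,1,1,e^{i\theta})$, so that only $\ket{11}$ acquires the
phase.
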
For example, the effect of controlled-{\sc not} (``{\sc cnot}'') is
  \begin{eqnarray}
\ket{00} &\rightarrow& \ket{00} \nonumber \\
\ket{01} &\rightarrow& \ket{01} \nonumber \\
\ket{10} &\rightarrow& \ket{11} \nonumber \\
\ket{11} &\rightarrow& \ket{10}       \label{cnot}
  \end{eqnarray}
Here the second qubit undergoes a {\sc not} if and only if the first
qubit is in the state $\ket{1}$.
This list of state changes is the analogue of the truth table
for a classical binary logic gate.

The effect of controlled-{\sc not} acting on a state
$\ket{a}\ket{b}$ can be written
$a \rightarrow a$, $b \rightarrow a \oplus b$, where
$\oplus$ signifies the exclusive or ({\sc xor}) operation.
For this reason, this gate is also called the {\sc xor} gate.

Other logical operations require further qubits. For example, the
{\sc and} operation is achieved by use of the 3-qubit
``controlled-controlled-{\sc not}'' gate, in which the third qubit
experiences {\sc not} if and only if both the others are in the state
$\ket{1}$. This gate is named a Toffoli gate, after Toffoli (1980)
who showed that the classical version is universal for classical
reversible computation.
The effect on a state $\ket{a}\ket{b}\ket{0}$
is $a \rightarrow a, b \rightarrow b, 0 \rightarrow a \cdot b$.
In other words if the third qubit is prepared in $\ket{0}$ then
this gate computes the {\sc and} of the first two qubits. The use
of three qubits is necessary in order to permit the whole operation
to be unitary, and thus allowed in quantum mechanical evolution.

It is an amusing exercise to find the combinations of gates which
perform elementary arithmetical operations such as binary addition and
multiplication. Many basic constructions are given by Barenco
{\em et al.} (1995b); further general design considerations are
discussed
by Vedral {\em et al.} (1996) and Beckman {\em et al.} (1996).
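
As one small instance of this exercise (a minimal sketch, not
necessarily the construction used in the papers just cited), a one-bit
half adder can be built from the two gates described above. Starting
from $\ket{a}\ket{b}\ket{0}$, apply the Toffoli gate with the first two
qubits as controls, and then {\sc xor} with the first qubit as control
and the second as target:
\begin{eqnarray*}
\ket{a}\ket{b}\ket{0} &\rightarrow& \ket{a}\ket{b}\ket{a \cdot b} \\
&\rightarrow& \ket{a}\ket{a \oplus b}\ket{a \cdot b} .
\end{eqnarray*}
The second qubit then holds the sum bit $a \oplus b$ and the third the
carry bit $a \cdot b$. For $a = b = 1$ the final state is
$\ket{1}\ket{0}\ket{1}$, i.e. sum 0 and carry 1, as required for
$1 + 1 = 10$ in binary.
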
The action of a sequence of quantum gates can be written in
operator notation, for example
$X_1 H_2 \mbox{\sc xor}_{1,3} \ket{\phi}$
where $\ket{\phi}$ is some state of three qubits, and the subscripts
on the operators indicate to which qubits they apply. However, once
more than a few quantum gates are involved, this notation is
rather obscure, and can usefully be replaced by a diagram known
as a quantum network (see fig. 8). These diagrams will be used
hereafter.

\subsection{No cloning}

{\em No cloning theorem:} An unknown quantum state cannot be cloned.

This states that it is impossible to generate copies of a quantum state
reliably, unless the state is already known (i.e. unless there exists
classical information which specifies it). Proof: to generate a copy of
a quantum state $\ket{\alpha}$, we must cause a pair of quantum systems
to undergo the evolution
$U (\ket{\alpha} \ket{0}) = \ket{\alpha} \ket{\alpha}$ where $U$ is the
unitary evolution operator. If this is to work for any state, then $U$
must not depend on $\alpha$, and therefore
$U (\ket{\beta}  \ket{0}) = \ket{\beta}  \ket{\beta}$ for
$\ket{\beta} \ne \ket{\alpha}$. However, if we consider the state
$\ket{\gamma} = (\ket{\alpha} + \ket{\beta})/\sqrt{2}$, we have
$U (\ket{\gamma}  \ket{0}) = (\ket{\alpha}\ket{\alpha} + \ket{\beta}\ket{\beta})/\sqrt{2} \ne \ket{\gamma}\ket{\gamma}$
so the cloning operation fails. This argument applies to any purported
cloning method (Wootters and Zurek 1982, Dieks 1982).

Note that any given `cloning' operation $U$ can work on some states
($\ket{\alpha}$ and $\ket{\beta}$ in the above example), though
since $U$ is unitary, and so preserves inner products, two different
clonable states must
be orthogonal, $\left< \alpha \right| \left. \beta \right> = 0$.
Unless we already know that the state to be copied is one of these
states,
we cannot guarantee that the chosen $U$ will correctly clone it.
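
The orthogonality condition can be checked in one line. As a sketch,
assume only that $U$ is unitary and that the blank state $\ket{0}$ is
normalised. Taking the inner product of
$U(\ket{\alpha}\ket{0}) = \ket{\alpha}\ket{\alpha}$ with
$U(\ket{\beta}\ket{0}) = \ket{\beta}\ket{\beta}$, and using
$U^{\dagger} U = I$ on the left-hand side, gives
\begin{eqnarray*}
\left< \alpha \right| \left. \beta \right> \left< 0 \right| \left. 0 \right>
  &=& \left< \alpha \right| \left. \beta \right>^2 \\
\Rightarrow \;\; \left< \alpha \right| \left. \beta \right>
  &=& 0 \;\; \mbox{or} \;\; 1 ,
\end{eqnarray*}
so two distinct states which the same $U$ can clone must be orthogonal.
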
This isin contrast to classical information, where machines likephotocopiers can easily copy whatever classical information issent to them. The controlled-{\sc not} or {\sc xor} operationof equation (\ref{cnot}) is a copying operation for the states$\ket{0}$ and $\ket{1}$, but not for states such as $\ket{+}\equiv (\ket{0} + \ket{1}) / \sqrt{2}$ and $\ket{-}\equiv (\ket{0} - \ket{1}) / \sqrt{2}$.The no-cloning theorem and the EPR paradox together reveal a rather subtle way in which non-relativistic quantum mechanics is a consistent theory. For, if cloning were possible, then EPR correlations could be used to communicate faster than light, which leads to a contradiction (an effect preceding a cause) once the principles of special relativity are taken into account. To see this, observe that by generating many clones, and then measuring them in different bases, Bob could deduce unambiguously whether his member of an EPR pair is in a state of the basis $\{\ket{0}, \ket{1}\}$ or of the basis $\{\ket{+},\ket{-}\}$. Alice would communicate instanteously by forcing the EPR pair into one basis or the other through her choice of measurement axis (Glauber 1986). \subsection{Dense coding}We will discuss the following statement:{\em Quantum entanglement is an information resource.}Qubits can be used to store and transmit classical information.To transmit a classical bit string 00101, for example, Alice can send5 qubits prepared in the state $\ket{00101}$. The receiver Bobcan extract the information by measuring each qubit in the basis$\{ \ket{0}, \ket{1} \}$ (i.e. these are the eigenstates of the measuredobservable). The measurement results yield the classical bit stringwith no ambiguity. No more than one classicalbit can be communicated for each qubit sent.
Suppose now that Alice and Bob are in possession of an entangled pair
of qubits, in the state $\ket{00} + \ket{11}$ (we will usually drop
normalisation factors such as $\sqrt{2}$ from now on, to keep
the notation uncluttered). Alice and Bob need never have communicated:
we imagine a mechanical central facility generating entangled pairs and
sending one qubit to each of Alice and Bob, who store them
(see fig. 9a). In this situation, Alice can communicate {\em two}
classical bits by sending Bob only {\em one} qubit (namely her half of
the entangled pair). This idea, due to Wiesner (Bennett and Wiesner
1992), is called ``dense coding'', since only one quantum bit travels
from
Alice to Bob in order to convey two classical bits.
Two quantum bits are involved, but Alice only ever sees one of them.
The method relies on the following fact:
the four mutually orthogonal states $\ket{00} + \ket{11},\;
\ket{00} - \ket{11}$, $\ket{01} + \ket{10},\;\ket{01} - \ket{10}$
can be generated from each other by operations on a single qubit. This
set
of states is called the Bell basis, since they exhibit the strongest
possible Bell-EPR correlations (Braunstein {\em et al.} 1992).
Starting from
$\ket{00} + \ket{11}$, Alice can generate any of the Bell basis
states by operating on her qubit with one of the operators
$\{I,X,Y,Z\}$. Since there are four possibilities, her choice of
operation
represents two bits of classical information. She then sends her qubit
to Bob, who must deduce which Bell basis state the qubits are in. This
he does by operating on the pair with the {\sc xor} gate, and measuring
the target bit, thus distinguishing $\ket{00} \pm \ket{11}$ from
$\ket{01} \pm \ket{10}$. To find the sign in the superposition, he
operates with $H$ on the remaining qubit, and measures it. Hence
Bob obtains two classical bits with no ambiguity.

Dense coding is difficult to implement, and so has no practical value
merely as a standard communication method.
However, it can permit securecommunication: the qubit sent by Alice will only yield thetwo classical information bits to someone in possession of theentangled partner qubit. More generally, dense coding is an exampleof the statement which began this section.It reveals a relationship between classical information, qubits, and the information content of quantum entanglement (Barenco and Ekert 1995). A laboratory demonstration of the main features is described by Mattle {\em et. al.} (1996); Weinfurter (1994) and Braunstein and Mann (1995) discuss some of the methods employed, based on a source of EPR photon pairs from parametric down-conversion. \subsection{Quantum teleportation}{\em It is possible to transmit qubits without sending qubits!}Suppose Alice wishes to communicate to Bob a single qubit in the state$\ket{\phi}$. If Alice already knows what state she has, forexample $\ket{\phi} = \ket{0}$, she can communicate it to Bobby sending just classical information, eg ``Dear Bob, I have thestate $\ket{0}$. Regards, Alice.'' However, if $\ket{\phi}$ isunknown there is no way for Alice to learn it withcertainty: any measurement she may perform may change the state,and she cannot clone it and measure the copies. Hence it appearsthat the only way to transmit $\ket{\phi}$ to Bob is to sendhim the physical qubit (i.e. the electron or atom or whatever), orpossibly to swap the state into another quantum system and sendthat. In either case a quantum system is transmitted.Quantum teleportation (Bennett {\em et. al.} 1993, Bennett 1995)permits a way around this limitation.As in dense coding, we will use quantum entanglement as an informationresource. Suppose Alice and Bob possess an entangled pair in thestate $\ket{00} + \ket{11}$. Alice wishes to transmit to Boba qubit in an unknown state $\ket{\phi}$. Without loss ofgenerality, we can write $\ket{\phi} = a \ket{0} + b \ket{1}$where $a$ and $b$ are unknown coefficients. 
Then the initialstate of all three qubits is   \beqa\ket{000} + b\ket{100} + a\ket{011} + b\ket{111}  \eeqAlice now measures in the Bell basisthe first two qubits, i.e. the unknownone and her member of the entangled pair. The network to do thisis shown in fig. 9b. After Alice has applied the{\sc xor} and Hadamard gates, and just before she measures herqubits, the state is  \begin{eqnarray}\lefteqn{}&&   \ket{00}\left( a\ket{0} + b \ket{1}\right)+ \ket{01}\left( a\ket{1} + b \ket{0}\right)         \nonumber \\&& \rule{-2ex}{0em} + \ket{10}\left( a\ket{0} - b \ket{1}\right)+ \ket{11}\left( a\ket{1} - b \ket{0}\right).  \end{eqnarray}Alice's measurements collapse the state onto one of fourdifferent possibilities, and yield two classical bits. Thetwo bits are sent to Bob, who uses them to learn whichof the operators $\{I,X,Z,Y\}$ he must apply to his qubit inorder to place it in the state $a\ket{0} + b \ket{1} = \ket{\phi}$. ThusBob ends up with the qubit (i.e. the quantum information, notthe actual quantum system) which Alice wished to transmit.Note that the quantum information can only arrive at Bob if it disappears from Alice (no cloning). Also, quantum information is complete information: $\ket{\phi}$ is the complete description of Alice's qubit. The use of the word `teleportation' draws attention to these two facts. Teleportationbecomes an especially important idea when we come to consider communicationin the presence of noise, section \ref{s:qec}. \subsection{Quantum data compression}  \lab{s:qdc}Having introduced the qubit, we now wish to show that it is a useful measure of quantum information content. The proof of this is due to Jozsa and Schumacher (1994) and Schumacher (1995), building on work of Kholevo (1973) and Levitin (1987). To begin the argument, we first need a quantity which expresses how much information you would gain if you were to learn the quantum state of some system $\cal Q$. 
A suitable quantity is the von Neumann entropy
  \beq
    S(\rho) = - {\rm Tr} \rho \log \rho
  \eeq
where Tr is the trace operation, and $\rho$ is the density operator
describing an ensemble of states of the quantum system. This is
to be compared with the classical Shannon entropy, equation (\ref{S}).
Suppose a classical random variable $X$ has a
probability distribution $p(x)$. If a quantum system is
prepared in a state $\ket{x}$ dictated by the value of $X$,
then the density matrix is $\sum_x p(x)\ket{x} \bra{x}$, where the
states $\ket{x}$ need not be orthogonal. It can be shown (Kholevo 1973,
Levitin 1987) that $S(\rho)$ is an upper limit on the classical mutual
information $I(X:Y)$ between $X$ and the result $Y$ of a measurement on
the system.

To make connection with qubits, we consider the resources needed
to store or transmit the state of a quantum system $q$ of
density matrix $\rho$. The idea is to collect $n \gg 1$ such systems,
and
transfer (`encode') the joint state into some smaller system.
The smaller system is transmitted down the channel, and at
the receiving end the joint state is `decoded' into $n$ systems
$q'$ of the same type as $q$ (see fig. 9c).
The final density matrix of each $q'$ is $\rho'$, and the
whole process is deemed successful if $\rho'$ is sufficiently close
to $\rho$. The measure of the similarity between two density
matrices is the {\em fidelity} defined by
  \beq
f(\rho, \rho') =
\left( {\rm Tr} \sqrt{\rho^{1/2} \rho' \rho^{1/2} } \right)^2
  \eeq
This can be interpreted as the probability that $q'$ passes
a test which ascertained if it was in the state $\rho$. When
$\rho$ and $\rho'$ are both pure states, $\ket{\phi}\bra{\phi}$
and $\ket{\phi'}\bra{\phi'}$, the fidelity is
none other than the familiar overlap:
$f = | \left< \phi \right| \left. \phi' \right> |^2$.

Our aim is to find the smallest transmitted system which permits
$f = 1 - \epsilon$ for $\epsilon \ll 1$.
The argument is analogous to the `typical sequences' idea used in
section \ref{s:dc}. Restricting ourselves for simplicity to two-state
systems, the total state of $n$ systems is represented by a vector in a
Hilbert space of $2^n$ dimensions. However, if the von Neumann entropy
$S(\rho) < 1$ then it is highly likely (i.e. tends to certainty in the
limit of large $n$) that, in any given realisation, the state vector
actually falls in a {\em typical sub-space} of Hilbert space.
Schumacher and Jozsa showed that the dimension of the typical sub-space
is $2^{n S(\rho)}$. Hence only $n S(\rho)$ qubits are required to
represent the quantum information faithfully, and the qubit (i.e. the
logarithm of the dimensionality of Hilbert space) is a useful measure
of quantum information. Furthermore, the encoding and decoding
operation is `blind': it does not depend on knowledge of the exact
states being transmitted.

Schumacher and Jozsa's result is
powerful because it is general: no assumptions are made about
the exact nature of the quantum states involved. In particular, they
need not be orthogonal. If the states to be transmitted were mutually
orthogonal,
the whole problem would reduce to one of classical information.

The `encoding' and `decoding' required to achieve such quantum data
compression and decompression are technologically very demanding. They
cannot at present be done at all using photons. However, this is the
ultimate compression allowed by the laws of physics. The details of the
required quantum networks have been deduced by Cleve and DiVincenzo
(1996).

As well as the essential concept of information, other classical ideas
such as Huffman coding have their quantum counterparts. Furthermore,
Schumacher and Nielsen (1996) derive a quantity which they call
`coherent information' which is a measure of mutual information for
quantum systems.
It includes that part of the mutual information between entangled
systems which cannot be accounted for classically. This is a helpful
way to understand the Bell-EPR correlations.

\subsection{Quantum cryptography}

No overview of quantum information is complete without a mention
of quantum cryptography. This area stems from an unpublished paper
of Wiesner written around 1970 (Wiesner 1983). It includes various
ideas whereby
the properties of quantum systems are used to achieve useful
cryptographic tasks, such as secure (i.e. secret) communication. The
subject
may be divided into quantum {\em key distribution}, and a collection of
other ideas broadly related to {\em bit commitment}. Quantum key
distribution will be outlined below. Bit commitment refers to the
scenario in which Alice must make some decision, such as a
vote, in such a way that Bob can be sure that Alice fixed her vote
before
a given time, but where Bob
can only learn Alice's vote at some later time which she chooses.
A classical, cumbersome method to achieve bit commitment
is for Alice to write down her vote
and place it in a safe which she gives to Bob. When she wishes Bob,
later,
to learn the information, she gives him the key to the safe. A typical
quantum protocol is a carefully constructed variation on the idea
that Alice provides Bob with
a prepared qubit, and only later tells him in what basis it was
prepared.

The early contributions
to the field of quantum cryptography
were listed in the introduction; further references may be
found in the reviews mentioned at the beginning of this section.
Cryptography has the unusual feature that it is not possible to prove by
experiment that a cryptographic procedure is secure: who knows whether
a spy
or cheating person managed to beat the system? Instead, the users'
confidence
in the methods must rely on mathematical proofs of security, and it is
here that much important work has been done.
A concerted efforthas enabled proofs to be established for the securityof correctly implemented quantum key distribution. However, thebit commitment idea, long thought to be secure through quantummethods, was recently proved to be insecure(Mayers 1997, Lo and Chau 1997) because the participants cancheat by making use of quantum entanglement.Quantum key distribution is a method in which quantum statesare used to establish a random secret key for cryptography. The essentialideas are as follows: Alice and Bob are, as usual, widely seperatedand wish to communicate. Alice sends to Bob $2n$ qubits,each prepared inone of the states $\ket{0},\ket{1},\ket{+},\ket{-}$, randomlychosen\footnote{Many other methods are possible, we adopt thisone merely to illustrate the concepts.}.Bob measures his received bits, choosing the measurement basis randomly between $\{\ket{0},\ket{1}\}$ and $\{\ket{+},\ket{-}\}$.Next, Alice and Bob inform each other publicly (i.e. anyone can listen in)of the basis they used to prepare or measure each qubit. Theyfind out on which occasions they by chance used the same basis, whichhappens on average half the time, and retain just those results.In the absence of errors or interference, they now share the samerandom string of $n$ classical bits (they agree for exampleto associate $\ket{0}$ and $\ket{+}$ with 0; $\ket{1}$ and $\ket{-}$with 1). This classical bit string is often called the {\em raw quantumtransmission}, RQT.So far nothing has been gained by using qubits. The importantfeature is, however, that it is impossible for anyone to learnBob's measurement results by observing the qubits {\em en route},without leaving evidence of their presence. The crudest wayfor an eavesdopper Eve to attempt to discover the key would befor her to intercept the qubits and measure them, then pass them on to Bob.On average half the time Eve guesses Alice's basis correctly and thusdoes not disturb the qubit. 
However, Eve's correct guesses do not coincide with Bob's, so Eve learns the state of half of the $n$ qubits which Alice and Bob later decide to trust, and disturbs the other half, for example sending to Bob $\ket{+}$ for Alice's $\ket{0}$. Half of those disturbed will be projected by Bob's measurement back onto the original state sent by Alice, so overall Eve corrupts $n/4$ bits of the RQT.

Alice and Bob can now detect Eve's presence simply by randomly choosing $n/2$ bits of the RQT and announcing publicly the values they have. If they agree on all these bits, then they can trust that no eavesdropper was present, since the probability that Eve was present and they happened to choose $n/2$ uncorrupted bits is $(3/4)^{n/2} \simeq 10^{-62}$ for $n=1000$. The $n/2$ undisclosed bits form the secret key.

In practice the protocol is more complicated since Eve might adopt other strategies (e.g. not intercept all the qubits), and noise will corrupt some of the qubits even in the absence of an eavesdropper. Instead of rejecting the key if many of the disclosed bits differ, Alice and Bob retain it as long as they find the error rate to be well below $25\%$. They then process the key in two steps. The first is to detect and remove errors, which is done by publicly comparing parity checks on publicly chosen random subsets of the bits, while discarding bits to prevent increasing Eve's information. The second step is to decrease Eve's knowledge of the key, by distilling from it a smaller key, composed of parity values calculated from the original key. In this way a key of around $n/4$ bits is obtained, of which Eve probably knows less than $10^{-6}$ of one bit (Bennett {\em et al.} 1992).

The protocol just described is not the only one possible. Another approach (Ekert 1991) involves the use of EPR pairs, which Alice and Bob measure along one of three different axes. To rule out eavesdropping they check for Bell-EPR correlations in their results.
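
As an illustration of the intercept-and-resend estimate given earlier in this section, the arithmetic can be written out explicitly. This is only a sketch under the simplifying assumptions already made there (Eve intercepts every qubit, guesses each basis at random, and there is no other noise); the symbols $p_{\rm err}$, $P_{\rm miss}$ and $m$ are introduced here for convenience. A bit of the RQT is corrupted only if Eve chose the wrong basis and Bob's measurement then projected the qubit onto the wrong state, so
\begin{eqnarray*}
p_{\rm err} &=& \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}, \\
P_{\rm miss}(m) &=& (1 - p_{\rm err})^{m} = \left( \frac{3}{4} \right)^{m},
\end{eqnarray*}
where $P_{\rm miss}(m)$ is the probability that $m$ publicly compared bits all agree despite Eve's presence. Even a modest sample suffices: for $m = 100$, $P_{\rm miss} = (3/4)^{100} \simeq 3 \times 10^{-13}$.
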
The great thing about quantum key distribution is that it is feasible with current technology. A pioneering experiment (Bennett and Brassard 1989) demonstrated the principle, and much progress has been made since then. Hughes {\em et al.} (1995) and Phoenix and Townsend (1995) summarised the state of affairs two years ago, and recently Zbinden {\em et al.} (1997) have reported excellent key distribution through 23 km of standard telecom fibre under Lake Geneva. The qubits are stored in the polarisation states of laser pulses, i.e. coherent states of light, with on average $0.1$ photons per pulse. This low light level is necessary so that pulses containing more than one photon are unlikely. Such pulses would provide duplicate qubits, and hence a means for an eavesdropper to go undetected. The system achieves a bit error rate of $1.35\%$, which is low enough to guarantee privacy in the full protocol. The data transmission rate is rather low: MHz as opposed to the GHz rates common in classical communications, but the system is very reliable.

Such spectacular experimental mastery is in contrast to the subject of the next section.

\section{The universal quantum computer}  \lab{s:uqc}

We now have sufficient concepts to understand the jewel at the heart of quantum information theory, namely, the quantum computer (QC). Ekert and Jozsa (1996) and Barenco (1996) give introductory reviews concentrating on the quantum computer and factorisation; a review with emphasis on practicalities is provided by Spiller (1996). Introductory material is also provided by DiVincenzo (1995b) and Shor (1996).

The QC is first and foremost a machine which is a theoretical construct, like a thought-experiment, whose purpose is to allow quantum information processing to be formally analysed. In particular it establishes the Church-Turing Principle introduced in section \ref{s:qvc}.
Here is a prescription for a quantum computer, based on that of Deutsch (1985, 1989): A quantum computer is a set of $n$ qubits in which the following operations are experimentally feasible:
\begin{enumerate}
\item Each qubit can be prepared in some known state $\ket{0}$.
\item Each qubit can be measured in the basis $\{ \ket{0}, \ket{1} \}$.
\item A universal quantum gate (or set of gates) can be applied at will to any fixed-size subset of the qubits.
\item The qubits do not evolve other than via the above transformations.
\end{enumerate}
This prescription is incomplete in certain technical ways to be discussed, but it encompasses the main ideas. The model of computation we have in mind is a network model, in which logic gates are applied sequentially to a set of bits (here, quantum bits). In an electronic classical computer, logic gates are spread out in space on a circuit board, but in the QC we typically imagine the logic gates to be interactions turned on and off in time, with the qubits at fixed positions, as in a quantum network diagram (figs. 8, 12). Other models of quantum computation can be conceived, such as a cellular automaton model (Margolus 1990).

\subsection{Universal gate}

The universal quantum gate is the quantum equivalent of the classical universal gate, namely a gate which by its repeated use on different combinations of bits can generate the action of any other gate. What is the set of all possible quantum gates, however? To answer this, we appeal to the principles of quantum mechanics (Schr\"odinger's equation), and answer that since all quantum evolution is unitary, it is sufficient to be able to generate {\em all unitary transformations} of the $n$ qubits in the computer. This might seem a tall order, since we have a continuous and therefore infinite set. However, it turns out that quite simple quantum gates can be universal, as Deutsch showed in 1985.
The simplest way to think about universal gates is to consider the pair of gates $V(\theta, \phi)$ and controlled-not (or {\sc xor}), where $V(\theta, \phi)$ is a general rotation of a single qubit, i.e.
\beq
V(\theta, \phi) = \left( \begin{array}{lr}
\cos (\theta/2) & -i e^{-i\phi} \sin (\theta/2) \\
-i e^{i\phi} \sin (\theta/2) & \cos (\theta/2)
\end{array} \right).    \lab{V}
\eeq
It can be shown that any $n \times n$ unitary matrix can be formed by composing 2-qubit {\sc xor} gates and single-qubit rotations. Therefore, this pair of operations is universal for quantum computation. A purist may argue that $V(\theta, \phi)$ is an infinite set of gates since the parameters $\theta$ and $\phi$ are continuous, but it suffices to choose two particular irrational angles for $\theta$ and $\phi$, and the resulting single gate can generate all single-qubit rotations by repeated application; however, a practical system need not use such laborious methods. The {\sc xor} and rotation operations can be combined to make a controlled rotation which is a single universal gate. Such universal quantum gates were discussed by Deutsch {\em et al.} (1995), Lloyd (1995), DiVincenzo (1995a) and Barenco (1995).

It is remarkable that 2-qubit gates are sufficient for quantum computation. This is why the quantum gate is a powerful and important concept.

\subsection{Church-Turing principle}

Having presented the QC, it is necessary to argue for its universality, i.e. that it fulfills the Church-Turing Principle as claimed. The two-step argument is very simple. First, the state of any finite quantum system is simply a vector in Hilbert space, and therefore can be represented to arbitrary precision by a finite number of qubits. Secondly, the evolution of any finite quantum system is a unitary transformation of the state, and therefore can be simulated on the QC, which can generate any unitary transformation with arbitrary precision.
A point of principle is raised by Myers (1997), who points out that there is a difficulty with computational tasks for which the number of steps for completion cannot be predicted. We cannot in general observe the QC to find out if it has halted, in contrast to a classical computer. However, we will only be concerned with tasks where either the number of steps is predictable, or the QC can signal completion by setting a dedicated qubit which is otherwise not involved in the computation (Deutsch 1985). This is a very broad class of problems. Nielsen and Chuang (1997) consider the use of a {\em fixed} quantum gate array, showing that there is no array which, operating on qubits representing both data and program, can perform any unitary transformation on the data. However, we consider a machine in which a classical computer controls the quantum gates applied to a quantum register, so any gate array can be `ordered' by a classical program to the classical computer.

The QC is certainly an interesting theoretical tool. However, there hangs over it a large and important question-mark: what about imperfection? The prescription given above is written as if measurements and gates can be applied with arbitrary precision, which is unphysical, as is the fourth requirement (no extraneous evolution). The prescription can be made realistic by attaching to each of the four requirements a statement about the degree of allowable imprecision. This is a subject of on-going research, and we will take it up in section \ref{s:qec}. Meanwhile, let us investigate more specifically what a sufficiently well-made quantum computer might do.

\section{Quantum algorithms}  \lab{s:qa}

It is well known that classical computers are able to calculate the behaviour of quantum systems, so we have not yet demonstrated that a quantum computer can do anything which a classical computer cannot.
Indeed, since our theories of physics always involve equations which we can write down and manipulate, it seems highly unlikely that quantum mechanics, or any future physical theory, would permit computational problems to be addressed which are not in principle solvable on a large enough classical Turing machine. However, as we saw in section \ref{s:cc}, those words `large enough', and also `fast enough', are centrally important in computer science. Problems which are computationally `hard' can be impossible in practice. In technical language, while quantum computing does not enlarge the set of computational problems which can be addressed (compared to classical computing), it does introduce the possibility of new complexity classes. Put more simply, tasks for which classical computers are too slow may be solvable with quantum computers.

\subsection{Simulation of physical systems}

The first and most obvious application of a QC is that of simulating some other quantum system. To simulate a state vector in a $2^n$-dimensional Hilbert space, a classical computer needs to manipulate vectors containing of order $2^n$ complex numbers, whereas a quantum computer requires just $n$ qubits, making it much more efficient in storage space. To simulate evolution, in general both the classical and quantum computers will be inefficient. A classical computer must manipulate matrices containing of order $2^{2n}$ elements, which requires a number of operations (multiplication, addition) exponentially large in $n$, while a quantum computer must build unitary operations in $2^n$-dimensional Hilbert space, which usually requires an exponentially large number of elementary quantum logic gates. Therefore the quantum computer is not guaranteed to simulate {\em every} physical system efficiently.
However, it can be shownthat it can simulate a largeclass of quantum systems efficiently, including many forwhich there is no efficient classical algorithm, suchas many-body systems with local interactions (Lloyd 1996,Zalka 1996, Wiesner 1996, Meyer 1996, Lidar and Biam 1996, Abramsand Lloyd 1997, Boghosian and Taylor 1997). \subsection{Period finding and Shor's factorisation algorithm}So far we have discussed simulation of Nature, which is a rather restrictedtype of computation. We would like to let the QC looseon more general problems, but it has so far proved hard to find oneson which it performs better than classical computers. However, thefact that there exist such problems at all is a profound insight intophysics, and has stimulated much of the recent interest in the field.Currently one of the most important quantum algorithms is that forfinding the period of a function. Suppose a function $f(x)$ is periodic with period $r$, i.e. $f(x) = f(x + r)$. Suppose further that $f(x)$ can be efficiently computed from $x$, and all we know initially is that $N/2 < r < N$ for some $N$. Assuming there is no analytic technique to deduce the period of $f(x)$, the best we can do on a classical computer is to calculate $f(x)$ for of order $N/2$ values of $x$, and find out when the functionrepeats itself (for well-behaved functions only $O(\sqrt{N})$ values maybe needed on average). This is inefficient since the number of operations is exponential in the input size $\log N$(the information required to specify $N$).The task can be solved efficiently on a QC by the elegantmethod shown in fig. 10, due to Shor (1994), building on Simon (1994). TheQC requires $2n$ qubits, plus a further $0(n)$ for workspace,where $n = \lceil 2 \log N \rceil$ (the notation $\lceil x \rceil$ means the nearest integer greater than $x$). These are divided into two `registers', each of $n$ qubits. 
They will be referred to as the $x$ and $y$ registers; both are initially prepared in the state $\ket{0}$ (i.e. all $n$ qubits in states $\ket{0}$). Next, the operation $H$ is applied to each qubit in the $x$ register, making the total state
\beq
\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} \ket{x} \ket{0}  \label{step1}
\eeq
where $w = 2^n$. This operation is referred to as a Fourier transform in fig. 10, for reasons that will shortly become apparent. The notation $\ket{x}$ means a state such as $\ket{0011010}$, where $0011010$ is the integer $x$ in binary notation. In this context the basis $\{ \ket{0}, \ket{1} \}$ is referred to as the `computational basis.' It is convenient (though not of course necessary) to use this basis when describing the computer.

Next, a network of logic gates is applied to both $x$ and $y$ registers, to perform the transformation $U_f \ket{x} \ket{0} = \ket{x} \ket{f(x)}$. Note that this transformation can be unitary because the input state $\ket{x} \ket{0}$ is in one to one correspondence with the output state $\ket{x} \ket{f(x)}$, so the process is reversible. Now, applying $U_f$ to the state given in eq. (\ref{step1}), we obtain
\beq
\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} \ket{x} \ket{f(x)}  \label{step2}
\eeq
This state is illustrated in fig. 11a. At this point something rather wonderful has taken place: the value of $f(x)$ has been calculated for $w=2^n$ values of $x$, all in one go! This feature is referred to as {\em quantum parallelism} and represents a huge parallelism because of the exponential dependence on $n$ (imagine having $2^{100}$, i.e. a million times Avogadro's number, of classical processors!).

Although the $2^n$ evaluations of $f(x)$ are in some sense `present' in the quantum state in eq. (\ref{step2}), unfortunately we cannot gain direct access to them.
For, a measurement (in the computational basis) of the $y$ register, which is the next step in the algorithm, will only reveal one value of $f(x)$\footnote{It is not strictly necessary to measure the $y$ register, but this simplifies the description.}. Suppose the value obtained is $f(x) = u$. The $y$ register state collapses onto $\ket{u}$, and the total state becomes
\beq
\frac{1}{\sqrt{M}} \sum_{j=0}^{M-1} \ket{d_u+jr} \ket{u} \label{step3}
\eeq
where $d_u + j r$, for $j=0,1,2 \ldots M-1$, are all the values of $x$ for which $f(x) = u$. In other words the periodicity of $f(x)$ means that the $x$ register remains in a superposition of $M \simeq w/r$ states, at values of $x$ separated by the period $r$. Note that the offset $d_u$ of the set of $x$ values depends on the value $u$ obtained in the measurement of the $y$ register.

It now remains to extract the periodicity of the state in the $x$ register. This is done by applying a Fourier transform, and then measuring the state. The discrete Fourier transform employed is the following unitary process:
\beq
U_{\cal FT} \ket{x} = \frac{1}{\sqrt{w}} \sum_{k=0}^{w-1} e^{i 2\pi k x /w} \ket{k}
\eeq
Note that eq. (\ref{step1}) is an example of this, operating on the initial state $\ket{0}$. The quantum network to apply $U_{\cal FT}$ is based on the fast Fourier transform algorithm (see, e.g., Knuth (1981)). The quantum version was worked out by Coppersmith (1994) and Deutsch (1994) independently; a clear presentation may also be found in Ekert and Jozsa (1996), Barenco (1996)\footnote{An exact quantum Fourier transform would require rotation operations of precision exponential in $n$, which raises a problem with the efficiency of Shor's algorithm. However, an approximate version of the Fourier transform is sufficient (Barenco {\em et al.} 1996)}.

Before applying $U_{\cal FT}$ to eq. (\ref{step3}) we will make the simplifying assumption that $r$ divides $w$ exactly, so $M = w/r$.
The essential ideas are notaffected by this restriction; when it is relaxed some addedcomplications must be taken into account (Shor 1994, 1995a; Ekert and Josza 1996). The $y$ register no longer concerns us, so we will just considerthe $x$ state from eq. (\ref{step3}):  \beqU_{\cal FT} \frac{1}{\sqrt{w/r}} \sum_{j=0}^{w/r-1} \ket{d_u + jr}   = \frac{1}{\sqrt{r}}\sum_k \tilde{f} (k) \ket{k}     \label{step4}  \eeqwhere  \beq|\tilde{f} (k)| = \left\{ \begin{array}{ll} 1 \;\;\; & \mbox{if $k$ is amultiple of $w/r$} \\0 & \mbox{otherwise}  \end{array} \right.       \eeqThis state is illustrated in fig. 11b.The final state of the $x$ register is now measured, and we seethat the value obtained must be a multiple of $w/r$. It remainsto deduce $r$ from this. We have $x = \lambda w/r$where $\lambda$ is unknown. If $\lambda$ and $r$ have nocommon factors, then we cancel $x/w$ down to an irreducible fraction and thus obtain $\lambda$ and $r$. If $\lambda$ and $r$ have a common factor, which is unlikely for large $r$, then the algorithm fails. In this case, the whole algorithm must be repeated from the start. After a number of repetitions no greater than $\sim \log r$, and usually muchless than this, the probability of success can be shown to be arbitrarily close to $1$ (Ekert and Josza 1996). The quantum period-finding algorithm we have described is efficient as long as $U_f$, the evaluation of $f(x)$, is efficient. The total number of elementary logic gates required is a polynomial rather than exponential function of $n$. As was emphasised in section \ref{s:cc}, this makes all the difference between tractable and intractable in practice, for sufficiently large $n$. To add the icing on the cake, it can be remarked that the importantfactorisation problem mentioned in section \ref{s:cc} can bereduced to one of finding the period of a simple function. 
Thisand all the above ingredients were first brought together by Shor (1994),who thus showed that the factorisation problem is tractableon an ideal quantum computer. The function to be evaluated in thiscase is $f(x) = a^x \;{\rm mod}\;N$ where $N$ is the number tobe factorised, and $a < N$ is chosen randomly. One can showusing elementary number theory (Ekert and Josza 1996) that for most choices of $a$, the period $r$ is even and $a^{r/2} \pm 1$ shares a common factor with $N$. The common factor (which is of course a factor $N$) can then be deduced rapidly using a classical algorithm due to Euclid ({\em circa} 300 BC; see, e.g. Hardy and Wright 1965). To evaluate $f(x)$ efficiently, repeated squaring (modulo $N$) is used, giving powers $((a^2)^2)^2 \ldots$. Selected such powers of $a$, corresponding to the binary expansion of $a$, are then multiplied together. Complete networks for the whole of Shor's algorithm were describedby Miquel {\em et. al.} (1996), Vedral {\em et. al.} (1996) and Beckman{\em et. al.} (1996). They require of order $300 (\log N)^3$ logic gates.Therefore, to factorise numbers of order $10^{130}$, i.e. at the limit of current classical methods, would require $\sim 2 \times 10^{10}$ gates per run, or 7 hours if the `switching rate' is one megaHertz\footnote{The algorithm might need to berun $\log r \sim 60$ times to ensure at least one successful run, butthe average number of runs required will be much less than this.}.Considering how difficult it is to make a quantum computer, this offers no advantage over classical computation. However, if we double the number of digits to 260 then the problem is intractable classically (see section \ref{s:cc}), while the ideal quantum computer takes just 8 times longer than before. The existence of such a powerful method is an exciting and profound new insight into quantum theory. 
The period-finding algorithm appears at first sight like a conjuring trick: it is not quite clear how the quantum computer managed to produce the period like a rabbit out of a hat. Examining fig. 11 and equations (\ref{step1}) to (\ref{step4}), I would say that the most important features are contained in eq. (\ref{step2}). They are not only the {\em quantum parallelism} already mentioned, but also {\em quantum entanglement}, and, finally, quantum interference. Each value of $f(x)$ retains a link with the value of $x$ which produced it, through the entanglement of the $x$ and $y$ registers in eq. (\ref{step2}). The `magic' happens when a measurement of the $y$ register produces the special state $\ket{\psi}$ (eq. \ref{step3}) in the $x$ register, and it is quantum entanglement which permits this (see also Jozsa 1997a). The final Fourier transform can be regarded as an interference between the various superposed states in the $x$ register (compare with the action of a diffraction grating).

Interference effects can be used for computational purposes with classical light fields, or water waves for that matter, so interference is not in itself the essentially quantum feature. Rather, the exponentially large number of interfering states, and the entanglement, are features which do not arise in classical systems.

\subsection{Grover's search algorithm}

Despite considerable efforts in the quantum computing community, the number of useful quantum algorithms which have been discovered remains small. They consist mainly of variants on the period-finding algorithm presented above, and another quite different task: that of searching an unstructured list. Grover (1997) presented a quantum algorithm for the following problem: given an unstructured list of items $\{ x_i \}$, find a particular item $x_j = t$. Think, for example, of looking for a particular telephone number in the telephone directory (for someone whose name you do not know).
It is not hard to prove that classical algorithms can do no better than searching through the list, requiring on average $N/2$ steps, for a list of $N$ items. Grover's algorithm requires of order $\sqrt{N}$ steps. The task remains computationally hard: it is not transferred to a new complexity class, but it is remarkable that such a seemingly hopeless task can be speeded up at all. The `quantum speed-up' $\sim \sqrt{N}/2$ is greater than that achieved by Shor's factorisation algorithm ($\sim \exp( 2 (\ln N)^{1/3} )$), and would be important for the huge sets ($N \simeq 10^{16}$) which can arise, for example, in code-breaking problems (Brassard 1997). An important further point was proved by Bennett {\em et al.} (1997), namely that Grover's algorithm is optimal: no quantum algorithm can do better than $O(\sqrt{N})$.

A brief sketch of Grover's algorithm is as follows. Each item has a label $i$, and we must be able to test in a unitary way whether any item is the one we are seeking. In other words there must exist a unitary operator $S$ such that $S \ket{i} = \ket{i}$ if $i \ne j$, and $S \ket{j} = - \ket{j}$, where $j$ is the label of the special item. For example, the test might establish whether $i$ is the solution of some hard computational problem\footnote{That is, an ``{\sc np}'' problem for which finding a solution is hard, but testing a proposed solution is easy.}. The method begins by placing a single quantum register in a superposition of all computational states, as in the period-finding algorithm (eq. (\ref{step1})). Define
\beq
\ket{ \Psi (\theta ) } \equiv  \sin \theta \ket{j} + \frac{ \cos \theta }{\sqrt{N-1}} \sum_{i \ne j} \ket{i}
\eeq
where $j$ is the label of the element $t = x_j$ to be found. The initially prepared state is an equally-weighted superposition, $\ket{ \Psi (\theta_0 ) }$ where $\sin \theta_0 = 1 / \sqrt{N}$.
Now apply $S$, which reverses the sign of the one special element of the superposition, then Fourier transform, change the sign of all components except $\ket{0}$, and Fourier transform back again. These operations represent a subtle interference effect which achieves the following transformation:
\beq
U_G \ket{ \Psi (\theta ) } = \ket{ \Psi (\theta + \phi ) }  \label{UG}
\eeq
where $\sin \phi = 2 \sqrt{N-1} / N$. The coefficient of the special element is now slightly larger than that of all the other elements. The method proceeds simply by applying $U_G$ $m$ times, where $m \simeq (\pi/4) \sqrt{N}$. The slow rotation brings $\theta$ very close to $\pi/2$, so the quantum state becomes almost precisely equal to $\ket{j}$. After the $m$ iterations the state is measured and the value $j$ obtained (with error probability $O(1/N)$). If $U_G$ is applied too many times, the success probability diminishes, so it is important to know $m$, which was deduced by Boyer {\em et al.} (1996). Kristen Fuchs compares the technique to cooking a souffl\'e. The state is placed in the `quantum oven' and the desired answer rises slowly. You must open the oven at the right time, neither too soon nor too late, to guarantee success. Otherwise the souffl\'e will fall: the state collapses to the wrong answer.

The two algorithms I have presented are the easiest to describe, and illustrate many of the methods of quantum computation. However, just what further methods may exist is an open question. Kitaev (1996) has shown how to solve the factorisation and related problems using a technique fundamentally different from Shor's. His ideas have some similarities to Grover's. Kitaev's method is helpfully clarified by Jozsa (1997b) who also brings out the common features of several quantum algorithms based on Fourier transforms. The quantum programmer's toolbox is thus slowly growing.
It seems safe to predict, however, that the class of problems for which quantum computers out-perform classical ones is a special and therefore small class. On the other hand, any problem for which finding solutions is hard, but testing a candidate solution is easy, can as a last resort be solved by an exhaustive search, and here Grover's algorithm may prove very useful.
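
The rotation picture of Grover's algorithm sketched above can be checked by hand in the smallest non-trivial case; the choice $N = 4$ is made here only for illustration. The initial state has $\sin \theta_0 = 1/\sqrt{N} = 1/2$, and each application of $U_G$ advances the angle by $\phi$ with $\sin \phi = 2 \sqrt{N-1}/N = \sqrt{3}/2$, so
\begin{eqnarray*}
\theta_0 &=& \pi/6, \\
\phi &=& \pi/3, \\
\theta_0 + \phi &=& \pi/2 .
\end{eqnarray*}
A single iteration therefore takes the register exactly to $\ket{\Psi(\pi/2)} = \ket{j}$. More generally $\sin \phi = 2 \sin \theta_0 \cos \theta_0$, i.e. $\phi = 2 \theta_0$, so after $m$ iterations the angle is $(2m+1)\theta_0$, which reaches $\pi/2$ when $m \simeq \pi/(4 \theta_0) \simeq (\pi/4)\sqrt{N}$ for large $N$.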