QIP 2015 business meeting

A QIP 2015 epilogue: our notes from the business meeting. See also this post by Kaushik Seshadreesan.

Business Meeting Report

local organizing committee report

$193,545 – $191,467 = $2,478 profit!
registration income: $185,340
refunds, about $3,000
external sponsorships: $30,450, and another $5k due later
total income before tax: $212,900
after tax: $193,545
tutorial: $5,296
main program: $47,941
banquet: $120*270 = $32,400
admin: $10k
travel support for about 41 junior researchers: $34k+
invited speakers: $45k estimated
rump session: $10,600 estimated
best student paper prize: $700
other/misc: $5k
total: $191,467
total: 276
in 2014: 261 (early before 31 oct, 169; standard by 30 nov, 68; late by 31 dec, 29)
in 2015: 15 (on-site 10)
no-show: 10
It’s great that the budget was balanced to about 1%. However, what to do about the little bit of extra money? This is a perpetual problem. Runyao had a nice simple idea: just send it to next year’s QIP and use it for travel support for junior researchers.

Program Committee Report:

197 talk-or-poster submissions (1 withdrawn) (In Barcelona, there were 222 but this decrease probably reflects the distance to Sydney rather than anything about the field.)
3 PC members for each submission, and 25 submissions per PC member.
3 weeks of refereeing, 2 weeks of discussion.
Much faster than a typical theoretical CS conference
39 accepts, including 2 mergers 20% accept
SC invited 3 more speakers 40 talks in the program
6 of these recommended to SC for plenary status
one best student paper
There were 601 reviews, at least 3 per submission
There were 142 external reviewers and 220 external reviews.
In the first round there were 102 posters accepted. 5 poster-only submissions, all rejected talk-or-poster submissions
92 more posters 90 accepted… one out of scope and one wrong.
About 40 people withdrew their posters or simply didn’t put up a poster.
We could have accepted about 20-30 more good papers. Future choice: accept more papers? This implies parallel sessions (if we decide to accept all of those good-enough-for-QIP papers). There are pros and cons of this. Pro: more people will be happy, and better representation of research. The Con is that the community will be more split, the conference needs two medium-size lecture rooms (but what about the plenary talks?).
Anecdotal feedback from authors: some reviews were sloppy. On the other hand, with only 3 weeks of refereeing we cannot expect too much. CS reviewers are more detailed and more critical.
Do we want the 3-page abstract format? There was not much discussion on this point, but Ronald de Wolf said that the costs outweigh the benefits in his opinion. We don’t have strong opinions. Steve likes them but thinks maybe two pages would be enough. Aram thinks we could do without them, or could go to 4 pages so the physicists could use their PRL and the computer scientists could use the first 4 pages of their long version. Apparently no one in the audience had strong opinions on this either, since there was no discussion of this point. Hopefully the next PC chair at least thinks carefully about this instead of going with the default.
Do we want to have the abstracts on the website? Again, there was no discussion of this point, but RdW thinks this is generally a good idea (and us Pontiffs agree with him).
Should we make the reviews public (subject to caveats)? E.g., something like what was done for TQC 2014, where lightly edited reviews were posted on SciRate. The answer is obviously yes. 🙂 We made a case for partial open reviewing, and the slides are here. The “partial” here is important. I think a lot of people have misinterpreted our proposal and counter-proposed a compromise in which only edited summaries of reviews of reports are posted for accepted papers; this is funny because it is essentially what we did for TQC 2014! It is true that in implementing this the details are extremely important, including instructions to the PC & subreviewers and the explanations of the system to authors and the public (e.g. crucially including the fact that published reviews are not meant to explain acceptance/rejection or even to label as “good” or “bad” but rather to add perspective). Probably these points should be in a longer post.
QIP 2016 will be held in Banff, with Barry Sanders chairing the local organizing committee.
Bids for QIP 2017 are being put in by Zürich and Seattle with local organizing committee chairs of Rotem Arnon Friedman and Krysta Svore respectively. (I mention chairs because my understanding is that no matter how large the committee is, they do a \Omega(1) fraction, or even a 1-o(1) fraction, of the total work.) A straw poll of attendees shows slight favor for Zürich. Krysta said that MSR would probably still be interested in hosting in 2018, when the geographic case for Seattle would be stronger. Neither place will be as glorious as Sydney in January, but Seattle winters are pretty mild (although gray).
Stephanie Wehner presented the results of a poll that showed support for parallel sessions (about 50% of respondents favored this over options like dropping plenary talks, dropping the free afternoon, shorter talks or doing nothing). Others, like Daniel Gottesman, complained that the poll seemed to be asking how to increase the number of talks, rather than whether we should. A show of hands at this point (from an audience that by now had become pretty small, perhaps in part because there was free beer at the rump session at this point) showed an audience roughly evenly divided between supporting and opposing an increase in the number of talks. The Trinity of Pontiffs are divided on the parallel question, but of course it doesn’t have to be all or nothing. We might try an experiment doing parallel talks on one day (or even half-day) out of five, so we can see how we like it.

QIP 2015 zombie-blogging, Day 5

Today’s highlight: an algorithm running in time O(n^{59}), also known as “polynomial time” or “efficient”.

Joseph Fitzsimons and Thomas Vidick.
A multiprover interactive proof system for the local Hamiltonian problem(Plenary Talk)
abstract arXiv:1409.0260

Thomas begins by reminding everyone about what NP is and mentions that Super Mario is NP-complete. However, I solved it as a child and saved the princess from Bowser. I’m not sure what the implications of this are for P vs. NP. Knowing that Super Mario is in NP saves me from having to learn it. I just reduce it to 3-SAT.
All problems in NP have a local verification procedure because of the NP-completeness of 3-SAT. But we still need to know the whole proof, right?
There is a nice way to make proof verification completely local, and that is to introduce more than one prover. We introduce a familiar cast of characters: Merlin and Arthur. Arthur is the referee and is allowed to ask two non-communicating Merlins to prove something to him. The value is defined to be \omega(G) = \sup_{\textrm{Merlins}} \textrm{Pr}[\text{Arthur accepts}]. We need to ensure that this scheme is both sound and complete. A stronger version known as the PCP theorem has the following interesting property. Arthur can do some pre-processing and then just check a constant number of clauses to get a high probability of soundness.
The 3-local Hamiltonian problem is a well-known complete problem for QMA. Here one is given a local Hamiltonian on n qubits with a promise that the ground state energy is either less than a or greater than b, with a-b \ge 1/\mathrm{poly}(n) (the “promise gap”), and one must decide which is the case.
In the quantum setting, we allow the Merlins to share initial entanglement, but they can’t communicate after that. Now the value is denoted by \omega^*(G) = \sup_{\textrm{Merlins}} \textrm{Pr}[\text{Arthur accepts}]. The star denotes this additional quantum entanglement shared by the Merlins.
Can Arthur use the entangled Merlins to his advantage, to recognize languages beyond NP? The Peres-Mermin “magic square” game makes this unclear since at least in some cases the Merlins can “cheat” and use entanglement to increase the acceptance probability. But it was shown [KKMTV 08, IKM 09] that Arthur can still use entangled Merlins to at least recognize languages in NP. An aside: this, like most previous work, viewed the entanglement between the Merlins mostly as a drawback to be mitigated using techniques like monogamy relations that force the provers to use strategies that resemble unentangled strategies.
To illustrate the difficulties, suppose we have a 1D Hamiltonian with nearest-neighbor interactions. Suppose that these are anti-ferromagnetic so that the unique ground state of each two-qubit term is the singlet, which we say has zero energy. This is of course highly frustrated and the ground-state energy will be proportional to the number of qubits. But a naive two-prover proof system would allow us to be convinced that the ground-state energy is zero. Suppose we can split the qubits into even and odd layers that are mutually commuting. We can have Merlin-1 take the odd qubits and Merlin-2 take the even qubits. We choose a random interaction, say on sites j and j+1, and ask Merlin-1 for one of them and Merlin-2 for the other. But this doesn’t work. The Merlins need only share a singlet state which they just return regardless of which question we ask.
The main result is a five-player game for the 3-local Hamiltonian problem. The messages from the Merlins to Arthur are quantum, but very short. The value of the game with n classical questions, 3 answer qubits, and with 5 players is QMA-hard to compute to within a 1/\mathrm{poly}(n) factor. Consider the ground state of the Hamiltonian encoded in the 5-qubit code. We will require the five Merlins to each have one share of these encoded qubits, so each Merlin has n qubits.
The protocol is as follows. Pick a random clause H_l on qubits i,j,k and either:

  1. energy check
    1. ask each Merlin for his share of i,j,k
    2. decode
    3. measure H_l
  2. code check
    1. ask one Merlin for his share of i,j,k
    2. ask other Merlins for their share of i
    3. verify that qubits lie in code subspace.

The intuition is that the Merlins are forced to be in a code space, with the states of 4 Merlin constraining the fifth. How is this proven?
The most general Merlin strategy is to share a state |\psi\rangle and to respond to a request for qubit i by applying a unitary U_i, or to a request for i,j,k with the unitary V_{i,j,k}. We would like to argue that any such strategy can be turned into a method for extracting all n qubits from the state |\psi\rangle.
This will be done using a method similar to a mosquito drinking blood: as it extracts blood it replaces the volume of liquid with its saliva. Here we extract a qubit i using U_i (actually U_i^{(1)}, \ldots, U_i^{(5)}), and then replace those qubits with halves of EPR pairs and then each prover applies U_i^\dagger. Actually, the other halves of the EPR pairs are never used so even a maximally mixed state would work. The point is just to put something there, effectively performing a logical SWAP.
This work also leads to a natural reformulation of the quantum PCP conjecture: Constant-factor approximations to the entangled value of a quantum game with entangled provers \omega^*(G) are QMA hard. The result is a first step in this direction by solving the case of a 1/\mathrm{poly}(n) factor.
Another consequence is for MIP and related complexity classes. MIP(c,s) refers to the class of problems with a multi-prover interactive proof with completeness c and soundness s. In this language the PCP theorem implies that NEXP=MIP(1,1/2).
In the quantum case Thomas proved that NEXP \subseteq (Q)MIP^*(1,1/2) in an earlier breakthrough. This work shows now that QMA_{EXP} is contained in MIP*(1-2^{-p}, 1-2^{-2p}), proving for the first time that entanglement increases the power of quantum proof systems. Here “proving” is in the usual complexity-theory sense, where we have to make some plausible assumption: in this case, that \text{QMA}_{\text{EXP}} \not\subseteq \text{NEXP}.
During the questions, Ronald de Wolf and Daniel Gottesman pointed out that you might be able to reduce it from 5 Merlins to 4 by using error-detecting codes, or even 3 by using qutrit quantum error-detecting codes. Or what about 2 using approximate QECC? (This last probably won’t work.)

Sergey Bravyi and Matthew Hastings. On complexity of the quantum Ising model
abstract arXiv:1410.0703

(This talk was presented by the heroic David Gosset because Sergey didn’t get a visa in time.)
The transverse Ising model (TIM) is important for several reasons. For one thing, this model is ubiquitous in the theory of phase transitions. It’s a good model of certain non-universal quantum devices like the D-wave machine. And the recent breakthrough of Cubitt and Montenaro shows that the TIM appears naturally as a possible intermediate complexity class.
We would love to understand the quantum annealing (QA) algorithm of Farhi et al., and unfortunately we won’t be able to do that here. But we can use it to help us understand a target-simulator model that lets us do reductions of various Hamiltonian complexity problems. An example, if we have a simulator QA machine that has TIM Hamiltonians, then it is unlikely that we can simulate a target QA machine that has access to 2-local Hamiltonians (which are BQP complete). The simulator TIM machine is contained in BQP \cap postBPP, which is unlikely to equal BQP.
Recall the class of “stoquastic” local Hamiltonians. These are local Hamiltonians with “no sign problem”, meaning that all off-diagonal matrix elements in the computational basis are real and non-positive. There is a natural complexity class, StoqMA, that captures the complexity of these Hamiltonians.
StoqQMA is like QMA but Arthur can only apply reversible classical gates (CNOT, Toffoli) and measure some fixed qubit in the X basis. He accepts iff the measurement outcome is +. He can use 0 and + ancillas.
StoqMA’s relation to other complexity classes:

  • P \subseteq NP \subseteq MA \subseteq StoqMA \subseteq QMA \subseteq A_0PP
  • StoqMA \subseteq SBP \subseteq PostBPP
  • SBP \subseteq AM \subseteq \Pi_2

(A_0PP and SBP are approximate counting classes.)
Main result: The local Hamiltonian problem for TIM on degree-3 graphs is StoqMA-complete. This sharpens the Cubitt and Montenaro classification by linking the TIM directly to StoqMA.
In the ferromagnetic TIM, the coupling terms are all positive and the Z-field is uniform. Another result is a polynomial-time approximation algorithm for the partition function of the ferromagnetic TIM. This generalizes and in fact makes use of a result of Jerrum & Sinclair from 1993. The run time is polynomial: O(n^{59}). Awesome. Taking the log of the partition function, one gets the free energy within an additive constant, and at low temperature, this approximates the ground state.

Rafael Chaves, Christian Majenz, Lukas Luft, Thiago O. Maciel, Dominik Janzing, Bernhard Schölkopf and David Gross.
Information-Theoretic Implications of Classical and Quantum Causal Structures
abstract arXiv:1407.3800

Given some empirically observable variables, which correlations between these are compatible with a presumed causal structure? This is a fundamental problem in science, as illustrated by the following situation. You would like to claim that smoking causes cancer. But all you have is correlation data… and correlation doesn’t imply causation. So when can we make that inference?
One of the main ideas of this work is to use entropies rather than probabilities in order to avoid headaches associated with nonconvex structure that appears in previous approaches to answering these types of questions.
Classically, Directed Acyclic Graphs (DAG) have edges encoding a causal relationship between the nodes that they connect. Conditional Independences (CIs) are encoded by a DAG; this is the causal structure. A given probability distribution is compatible with a given causal structure if it fulfills all of the CI constraints implied by a given DAG.
Marginal Scenarios: Usually, for a variety of reasons, not all variables in a DAG are observable.
Probabilities compatible with a causal structure are expressible by a polytope, within which they must fall, e.g. Bell’s theorem. However, in an example due to Geiger and Meek, which is symmetric and consisting of three unseen variables each causing two of the three observable events, A, B, and C, we have a geometry that is non-convex.
Classically, going to the entropic description, we get a description of marginal causal entropic cones in terms of the entropic Bell inequalities framework pioneered by Braunstein & Caves in 1988.
A variant of this is the common ancestor problem.
Quantumly, there is not a unique definition of what the causal structure is. What should we use? Informally it should specify the dependencies between a collection of quantum states and classical variables. We use 3 kinds of nodes,

  • Classical
  • Quantum states
  • Quantum operations, i.e. CPTP maps

Because measurement affects a quantum system, some CIs that are classically valid cannot be defined in the quantum case. But independencies still hold and we can use data processing inequality. One consequence is a strengthening of the IC inequality, and allows it to be generalized eg to quantum messages. (see also 1210.1943).

Nicolas Delfosse, Jacob Bian, Philippe Guerin and Robert Raussendorf.
Wigner function negativity and contextuality in quantum computation on rebits
abstract arXiv:1409.5170

What makes quantum computing work? Many explanations have been proffered in the past: entanglement, contextuality, superposition and interference, and even quantum discord. While it seems fair to say that we really have no idea what makes a quantum computer “work” so well, we do have some ideas for some specific models.
The main result of this talk is that contextuality is a necessary resource for universal quantum computation with magic states on rebits. Pro: these are two-level systems; Con: these are not qubits, only “rebits”.
Mermin’s square and Mermin’s star are state-independent proofs of contextuality. They look as if they will spoil the party. But in fact they help.
Previous work by Anders and Browne ’09 showed that contextuality powers measurement based quantum computation, and in 2014 Howard et al. showed that contextuality powers quantum computation with magic states.
In the setting of quantum computation by state injection, we have a restricted family of gates (e.g. Clifford gates) and we have some noisy version of a state that cannot be prepared from that gate set and starting with computational basis states. Which properties must a fiducial state possess in order to “promote” a restricted model of Clifford gates to a universal quantum computer? In work by Joe Emerson’s group, this was answered for odd-prime dimensional qudits that we need:

  • Wigner function negativity
  • Contextuality

Hudson’s theorem characterizes the set of pure states with non-negative Wigner functions: it is precisely the set of state that are Gaussian, aka stabilizer states in the discrete setting. Clifford operations cannot introduce (Wigner) negativity through gate operations, and so that makes negativity a resource for initial magic states. All of this works out very cleanly in the case of odd-prime dimensions.
Mermin’s magic square implies that not all contextuality can be attributed to states. This seems to ruin our “resource” perspective. Not all contextuality that’s present can be attributed to states. What’s worse, Mermin’s square yields a contextuality witness that classifies all 2-qubit quantum states as contextual.
To deal with this, we move to rebits, so that the density matrix \rho is restricted to be real with respect to the computational basis for all times. We also have to use a restricted gate set that is CSS-preserving, and purely Clifford. This also affects the state preparations and measurements that are allowed, but let’s skip the details.
Now there is a d=2 Hudson’s theorem: A pure n-rebit state has a non-negative Wigner function if and only if it is a CSS stabilizer state.
Wigner non-negativity then implies that Pauli measurements on \rho are described by a non-contextual hidden variable model. Equivalently contextuality implies negativity of the Wigner function, in our rebit/CSS setting.
There is no contradiction with the Mermin magic square because of the rebit property and the CSS restriction: we cannot measure ZX and XZ simultaneously. Only all-X and all-Z Pauli operators can be measured simultaneously.

Ramis Movassagh and Peter Shor.
Power law violation of the area law in quantum spin chains
abstract arXiv:1408.1657

We’ve already discussed the area law and its important implications for the efficient simulation of gapped 1-dimensional spin systems. We believe that an area law holds for higher-dimensional spin systems if the Hamiltonian is gapped. We also know that 1-dimensional spin-chains can in general be QMA-complete, but the spectral gap shrinks in the hard cases.
These authors have a series of papers on this larger project.
On a spin chain with qudits and random interactions that are projectors of rank r, they showed that

  • The ground state is frustration free but entangled when d \leq r \leq d^2/4.
  • Schmidt ranks were known, but gap/entropy weren’t known.

Then Irani and Gottesman-Hastings used the type of Hamiltonians from 1-d QMA-hardness constructions to obtain Hamiltionians with 1/poly(n) gap and O(n) entropy. The local dimension is “O(1)” but in a CS rather than physics sense (i.e. the constants are big). Some condensed matter theorists have dismissed these Hamiltonians as “fine-tuned.”
Previous work by these authors had frustration-free, 1/poly(n) gap and O(log n) entanglement, but this could still be explained away as being “critical” since this entropy scaling matched what one expected from conformal field theory.
The latest results, with entanglement entropy O(\sqrt{n}) and the same guarantees on the spectral gap, do not match any of the condensed matter theorists’ explanations. They only require spins of dimension 5, so they are much closer to a natural model.
The “Motzkin state” is the ground state in question here that they construct, as they use something called Motzkin paths to construct this Hamiltonian. The ground state of the original construction was a superposition of all Motzkin walks on a chain of length 2n.
We say a Motzkin state is the superposition of all Motzkin walks. A Motzkin walk is a walk that starts at 0, ends at 0, remains nonnegative, and goes up or down or remains level at each intervening step. Our Mozkin walks can have two kinds of up step and two kinds of down step. Our Hamiltonian introduces a penalty for violating the Motzkin rule.
The reason that these states can lead to entanglement is that the amount that it goes “up” on the left half of a cut must equal the amount that it goes “down” in the right half.
Combinatorially we know how many Motzkin walks there are of height m, number of kinds of step s, and length n. Turning the sum into an integral and doing saddle point integration we get the entanglement.
One can implement the constraints of a colored Motzkin walk with local constraints, and these become the terms in the local Hamiltonian. You can get an upper bound on the spectral gap using the variational principle. The lower bound can be obtained using similar techniques as the previous results by the authors and others.
Is there a continuum limit for these models? Can we rigorously prove these results with an external magnetic field that can eliminate the need for boundary conditions? Are there frustration-free Hamiltonians with unique ground states and no boundary conditions that violate the area law by large factors?

Hector Bombin.
Gauge color codes and Single-shot fault-tolerant quantum error correction (Plenary Talk)
abstract-89 arXiv:1311.0879 abstract-90 arXiv:1404.5504

Fault-tolerant gates are great, but they can’t be done in a transversal way and still be universal. This is the celebrated Eastin-Knill theorem. In the context of topological quantum error-correcting codes, there is a natural relaxation of the notion of transversal gates to finite-depth quantum circuits [Bravyi-Koenig’09].
Color codes are an interesting class of topological stabilizer codes that allows for a transversal implementation of a T gate in three dimensions. It generally saturates the Bravyi-Koenig bound on the Clifford hierarchy for transversality. Hector has generalized his color codes to a subsystem code version. The recent important paper by Paetznick and Reichardt introduced the notion of gauge fixing that lets us jump between different codes with different transversality restrictions, and this let’s us sidestep the Eastin-Knill theorem. The new gauge color codes can be combined with this gauge-fixing idea to move between conventional color codes and gauge color codes. In these two codes, there are two different sets of operations that together are universal.
In typical fault-tolerant methods, we make noisy syndrome measurements and we repeat them several times to avoid errors. The decoding step is a global classical calculation and the correction is a transversal quantum operation. Hector’s new paradigm of single-shot fault tolerance is a way to avoid the multiple rounds requirement in the old method.
3D color codes turn out to be single-shot fault tolerant (SSFT). This is because the gauge degrees of freedom have some redundancy and this can be used to make inferences about which stabilizer measurements might have been faulty. The notion of SSFT is closely linked to the notion of self-correction via the physical mechanism of confinement. As a simple example, consider the ferromagnetic Ising model in a symmetry-broken state below the critical temperature. Anti-aligned magnetic domains are confined in this phase. The following week Poulin gave a talk at Coogee that was skeptical about the possibility of single-shot fault tolerance. Definitely this notion needs to be made more precise.
Suppose we want to do fault-tolerant error correction in a gauge color code. A faulty gauge syndrome will be one with endpoints, and we can repair the gauge syndrome, with the branching points of the result giving the new syndrome. Importantly, the gauge syndrome is unconfined; it is random except for the fixed branching points. The effective wrong part of the gauge syndrome, however, is confined. Each connected component has branch points with neutral charge. Therefore the branching points exhibit charge confinement. This sounds intriguing, but none of the Pontiffs really understand the details.
Gauge fixing to either Z or X with give you either a transversal implementation of either T or HTH, and this lets you perform arbitrary gates on the encoded logical qubit.
Bonus Result: 3d quantum computation, with local operations and constant time overhead, but global classical computation. This uses the 3d color code for computation with a stack of 2d codes for memory; see arXiv:1412.5079 for details.

Courtney Brell.
Self-correcting stabilizer quantum memories in 3 dimensions or (slightly) less
abstract arXiv:1411.7046

We’ve previously blogged about this very interesting result here. But now there’s a problem! The thermodynamic considerations all still seem to hold, and the Hausdorff dimension of the code is still 3 or less, but the specific embedding theorem that Courtney had used previously doesn’t not apply. Therefore, it is still open if this code can be embedded in 3 dimensions with constant density. Courtney is currently working to fix the proof, but for now the embeddability of these codes is downgraded to a conjecture.
Courtney also gave a heuristic argument for why this embeddability conjecture might be true. In the limit of low lacunarity (which is basically a measure of how much a fractal violates translation invariance) there is a simple condition that says that the density of a projection stays bounded.
Two interesting tools that Courtney uses for the thermodynamic arguments that might have broader interest are the GKS inequality, which says that adding ferromagnetic interactions cannot reduce ferromagnetic order, and Merlini-Gruber duality, which is a duality similar to the Kramers-Wannier duality used to prove the phase transition in the Ising model.

Henrik Wilming, Rodrigo Gallego and Jens Eisert.
Universal operations in resource theories and local thermodynamics
abstract arXiv:1411.3754

This talk takes a resource-theory approach to thermodynamics. The goal is to extract work from a state, given various types of permitted free operations and a possibly limited ability to change the Hamiltonian by investing work. For each class of operations, the Gibbs state is a “free” state, analogous to product states in entanglement theory. The options are

  • weak thermal contact: Bringing the system in contact with the heat bath puts it into thermal equilibrium.
  • thermal operations: More generally are any energy-conserving unitary operations on the heat bath and the system. A much larger set of operations than weak thermal contact.
  • A still larger class of maps comprises all quantum channels that have the Gibbs state as a fixed point: call these Gibbs-preserving (GP) maps. GP maps can sometimes take a point below the thermal curve to a little bit above, without violating the 2nd law.

How effective these different models are at extracting work depends on the class of Hamiltonians allowed. If any Hamiltonians are allowed then there is a collapse and weak thermal contact can do as well as GP maps (and of course also thermal operations), essentially extracting all surplus free energy of a state. If we restrict the class of possible Hamiltonians then separations between these models are possible, in part because it’s harder to deal efficiently with off-diagonal terms in the density matrix.

QIP 2015 dead-blogging, Day 4

Warning for typical QIP attendees —
This first talk may have some implications for the real world 🙂

Andrea Mari, Vittorio Giovannetti, Alexander S. Holevo, R. Garcia-Patron and N. J. Cerf.
Majorization and entropy at the output of bosonic Gaussian channels.
Previous Title: Quantum state majorization at the output of bosonic Gaussian channels (Plenary Talk)
abstract arXiv:1312.3545

The solution of the Gaussian optimizer conjecture by Giovannetti, Holevo, and Garcia-Patron has led to enormous progress on a web of interconnected connected conjectures about Gaussian channels. This includes the phase space majorization conjecture, the additivity and minimum output Renyi entropies conjectures, strong-converse theorems for the classical capacity of gaussian channels, the entropy-power inequality conjecture, and more! The only remaining open questions are the entropy-photon number inequality and the quantum capacity.
The proof uses a decomposition of Gaussian channels. Every phase-insensitive Gaussian channel is equivalent to a quantum-limited attenuator followed by a quantum-limited amplifier. Every phase-contravariant Gaussian channel is equivalent to a quantum-limited attenuator followed by a quantum-limited amplifier followed by a noiseless phase conjugation.
It also uses an old result about quantum beam splitters. How can the output of a beamsplitter be distinguished from two independent broadcasts of the same source? Answer: only if the broadcast was a coherent state. Therefore we have the “golden property” of a quantum beam splitter: the only input states producing pure output states for a quantum limited attenuator are coherent states.
The main result of this talk is a proof of the majorization conjecture. This addresses the question: what is the minimum noise or disorder achievable at the output state, optimizing over all possible input states? The output from a coherent input state majorizes all other output states. Namely, if \Phi is a Gaussian channel, then
\Phi(\lvert\alpha\rangle\!\langle\alpha\rvert) \succ \Phi(\rho) for all \rho,  \lvert\alpha\rangle.
The proof uses the decomposition results mentioned above, concavity of entropy, and the breakthrough results of Giovannetti, Holevo, and Garcia-Patron. A similar result holds for any Renyi p entropy for p > 1. There is also a phase space majorization result.
Here is an awesome summary of what this research project has accomplished.

Runyao Duan and Andreas Winter. No-Signalling Assisted Zero-Error Capacity of Quantum Channels and an Information Theoretic Interpretation of the Lovasz Number. Previous title: Zero-Error Classical Channel Capacity and Simulation Cost Assisted by Quantum No-Signalling Correlations
abstract arXiv:1409.3426

Coming the morning after the conference dinner, Andreas beings with a “hangover summary” of the main results:

  1. C_0(G) \le \log \vartheta(G) (sometimes known to be strict)
  2. C_{0,E}(G) \leq \log \vartheta(G)
  3. C_{0,NS}(G) = \log \vartheta(G)

On the LHS we have the asymptotic zero-error capacity of a channel, assisted either by nothing (C_0), entanglement (C_{0,E}) or no-signalling correlations (C_{0,NS}). On the RHS we have the Lovasz theta number (see below) which can be calculated efficiently using semidefinite programming.
Classical zero-error capacity depends only on the transition graph, not the probabilities. A useful notion is that of confusability graph, which is an undirected graph on the input symbols of the channel where two symbols are confusable if there is a nonzero probability of confusing them. This encodes the combinatorics of the channel in a natural way. For example, product channels have a product confusability graph whose adjacency matrix is the tensor product of the confusability graphs of the factor channels. (More precisely I+ A(N\times N') = (I+A(N)) \otimes (I + A(N')).)
The largest codebook for this channel is given by the independence number of the confusability graph; this gives the optimal zero-error capacity. It’s unfortunately NP-complete to compute this property of a graph, and even approximating it is NP hard. However, a seminal result due to Lovasz gives a convex relaxation of this into a semidefinite program, the Lovasz theta number.
\alpha(G) \leq \vartheta(G) = \max \mathrm{Tr}(BJ)  \mbox{ s.t. } B\succeq 0,\ \mathrm{Tr} B=1,\ B_{xy} = 0 \mbox{ for all } x,y \in G
Another useful bound is the so-called fractional packing number, obtained by writing down the natural integer program for independence number and replacing the w_x\in \{0,1\} constraint with 0\leq w_x \leq 1, then observing that the upper bound is superlative:
\leq \alpha^*(\Gamma) = \max_x w_x \mbox{ s.t. } w_x \geq 0, \mbox{ and } \forall y \sum_x \Gamma(y|x) w_x \leq 1.
Minimizing over \Gamma consistent with G gives \alpha^*(G) = \min_{\Gamma\sim G} \alpha^*(\Gamma).
Crucially, both the Lovasz theta number and the fractional packing number are multiplicative, and therefore they bound not only the independence number, but also the capacity. Andreas said something surprising: If f(G) is multiplicative and satisfies \alpha(G) \leq f(G) \leq \vartheta(G), then we must have f(G) = \vartheta(G)! This is unpublished still, but follows from the beautiful formula [personal communication from Winter, via my memory, so hopefully not too wrong]:
\vartheta(G) = \sup_H \frac{\alpha(G\times H)}{\vartheta(H)}.
Compare with a similar formula for the fractional packing number:
\alpha^*(G) = \sup_H \frac{\alpha(G\times H)}{\alpha(H)}.
An example, in fact the usual example, is the “typewriter graph”. This has an input x\in\mathbb{Z}_5 and output y \in \{x,x+1\}. Then \alpha(G) = 2, \alpha(G\times G) =5 > \alpha(G)^2, \vartheta(G) = \sqrt{5} (thus giving a tight upper bound on the capacity) and \alpha^*(G) = 5/2.
We would like to close the gap provided by the above relaxations by allowing additional resources in the encoding and decoding. Shannon did this in his 1956 paper by allowing feedback, and we can also add entanglement and no-signaling correlations.
The Lovasz theta number is in fact an upper bound to the entanglement-assisted zero-error capacity, which sometimes is larger than the unassisted zero-error capacity. This gives an operational (but partial) explanation of why the Lovasz theta function is not always a tight upper bound.
On the other hand, the fractional packing number is equal to the non-signalling-assisted capacity, which is an awesome and nontrivial result.
Now what about the quantum case? Here things get even cooler, although not everything is as well understood. For quantum channels, the quantum generalizations of the transition and confusability graphs are given by the following correspondences:
K = \text{span}\{E_i\} and S = K^\dag K = \text{span} \{E_i^\dag E_j\}.
The authors have also also generalized (as in Will Matthews’ talk earlier) the notion of no-signalling operations to ones that take quantum inputs and outputs. Just as the classical no-signalling polytope is a small linear program, this quantum set has a nice SDP characterization.
Now consider \Upsilon(K) = \max \sum_x s_X \mbox{ s.t. } 0 \leq R_x \leq s_x \Pi_x^\perp\sum_x (R_x + s_x \Pi_x) = I. This is
\leq A(K) = \max \sum_x s_x \mbox{ s.t. } 0 \leq s_x, \sum_x s_x \Pi_x \leq I.
Then the “most amazing thing”, according to Andreas, is the following theorem:
C_{0,NS}(K) = \lim \frac{1}{n} \log \Upsilon(K^{\otimes n}) = \log A(K).
Note that this is not just an inequality, but an actual equality!
They show actually that \Upsilon(K^{\otimes n}) \ge n^{-c} A(K)^n.
The proof is reminiscent of the recent and inexplicably not-invited Fawzi-Renner breakthrough on the quantum conditional mutual information.
Now in terms of G, instead of K, \min A(K) = \vartheta(G), which is the first information-theoretic application of the Lovasz theta function.
Some of the closing remarks were that

  • SDP formulas for assisted capacity and simulation cost (one shot)
  • SDPs can regularize to relaxed SDPs!
  • Capacity interpretation of Lovasz theta number
  • Is there a gap between C_{0,E}(G) and \log\vartheta(G)?
  • Is regularization necessary?

Mario Berta, Omar Fawzi and Volkher Scholz.
Quantum-proof randomness extractors via operator space theory;
abstract arXiv:1409.3563

This talk is about producing high-quality (i.e. nearly uniform) randomness from [necessarily longer] strings that contain “lower-quality” randomness. Tools for doing this are called extractors; other closely related tools are condensers and expanders. Why would we need to do this? Well suppose we start with some uniform randomness and then leak some information about it to an adversary (Eve). Then Eve’s subjective view of our randomness is no longer uniform. Applying an extractor can make it again close to uniform. Even in a non-cryptographic setting they can be useful. Suppose we run a randomized algorithm using some large uniformly random seed. Conditioned on the output of this algorithm, the seed will no longer be uniformly random. If we want to continue using it, maybe even for later steps of a larger algorithm, we will need to clean it up using something like an extractor.
This establishes the need for extractors and their crucial property: their goal is to make a string random conditioned on some other string (e.g. Eve’s knowledge, or the earlier outputs of the randomized algorithm; call this the “side information”). This talk will consider the case when side information is in fact quantum. Conditioning on quantum things is always tricky, and one hint that the math here will be nontrivial is the fact that “operator space theory” is in the title.
One unpleasant fact about extractors is that they cannot be deterministic (this is a useful exercise for the reader). So they usually use some extra input, guaranteed to be uniform and independent of the source. The smallest possible seed size is logarithmic in the input size. Also the number of output bits can be (more or less) no larger than the min-entropy of the source. So the key parameters are the input size, the input min-entropy, the input seed size, the number of output bits and their trace distance from being uniform and independent of the side information.
One example of a nice classical result is the leftover hash lemma.
Can we still get the same result if adversary is quantum? This will have implications for privacy amplification in q crypto and also (not sure why about this one) properties of q memory.
Here min-entropy becomes the conditional min-entropy, which involves a maximization over all guessing strategies and measures knowledge of an adversary with access to a q system correlated with the source.
The main result here is a mathematical framework to study this, based on operator space theory. Why operator space? We can define C(Ext, k) to be the maximum advantage for a classical adversary against the extractor Ext on sources with k bits of min-entropy. The key insight is that this is a norm, specifically an operator norm. Sources with \geq k bits of min-entropy correspond to probability distributions (i.e. \|x\|_1 \leq 1) with all entries \leq 2^{-k}, (i.e. \|x\|_\infty \leq 2^{-k}). The intersection of these two constraints defines a centrally symmetric convex set (if we subtract off the uniform distribution, I suppose), which can then define a norm. One other example of such a hybrid norm is the family of Sobolev norms. Now the achievable bias is like the maximum 1-norm of the output of such a vector when we act on it with the extractor Ext, which we can think of a linear map. So this is an operator norm, i.e. the norm of Ext is the max of the norm of Ext(x) divided by the norm of x, where we measure the numerator in the l_1 norm and the denominator in this funny hybrid norm.
The advantage of a quantum adversary is Q(Ext, k) which is like the cb (completely bounded) version of the above. Basically suppose that instead of the entries of x being numbers they are matrices. This is kind of like what team Madrid did with nonlocal games.
They also define an SDP relaxation which has the advantage of being efficiently computable.
It follows trivially from the definitions that
C(Ext,k) \leq Q(Ext,k) \leq SDP (Ext, k).
The nontrivial stuff comes from the fact that the SDP relaxations are in some cases analytically tractable, e.g. showing that small-output and high input entropy extractors are quantum-proof (i.e. secure against quantum adversaries).
A nice open question: Given that there is an SDP, can we define a convergent hierarchy?

Richard Cleve, Debbie Leung, Li Liu and Chunhao Wang.
Near-linear construction of exact unitary 2-designs

The basic idea of a unitary t-design is that it is a discrete set of unitary operators that exactly reproduce the first t moments of the Haar measure, i.e. the uniform measure on the unitary group. This construction lets us sample efficiently and still reproduce important properties of the Haar measure. But we don’t just want constructions with small cardinality, we also want them to have efficient gate complexity.
The speaker jokingly says, “unfortunately, these have a lot of applications.” This includes randomized benchmarking, decoupling and error-tolerant QKD schemes. There is another joke about how decoupling doesn’t mean breaking up (romantic) couples. But if Alice and Bob are entangled, and one passes through a decoupling map, then they end up nearly separable! That sounds like a breakup to me. Note that a weaker randomizing map would merely leave them with a data hiding state, which I suppose corresponds to the facebook status “It’s Complicated.” These jokes are still in focus groups, so please forgive them.
We would like to have unitary designs that have low gate complexity. The main result of this talk is a Clifford-only unitary 2-design that uses O(n \log n \log \log n) gates, but it assumes an extension of the Riemann hypothesis. Without this hypothesis, they also have a non-Clifford construction with the same complexity, and a Clifford-only scheme with complexity O(n \log^2 n \log \log n). Sampling uniformly from the Clifford group has gate complexity O(n^2/\log n), so this is an improvement.
One can construct these 2-designs by writing gates in the form S^{\otimes n} H^{\otimes n} M_r, where the matrix M_r has a decomposition with low complexity that is the main technical contribution of the authors. By using the result that Pauli mixing by conjugation implies unitary 2-design, one only needs to consider certain forms of the matrix M_r. Ronald de Wolf suggested that these results could be improved a bit further by using the fast integer multiplication algorithms due to Fürer.

Carl Miller and Yaoyun Shi.
Universal security for randomness expansion. Previous title: Universal security for quantum contextual devices
abstract arXiv:1411.6608

Carl opens his talk by saying that this result has been a goal of the U of M group for the past four years. It then proceeds in suitably epic fashion by looking up the definition of randomness from The Urban Dictionary (highlight: “A word often misused by morons who don’t know very many other words.”) and moving onto modern cryptographic standards (skipping, though, this gem).
This talk follows the research program laid out by Colbeck in 2006, where he suggested that one might take the output of some CHSH games, verify that the winning probability is higher than the optimal classical value, and then apply a deterministic extractor. Difficulties here involve quantum side information, information locking, and the other nuisances that composable security was invented to address. Vazirani-Vidick showed in 2011 that this was possible if in the honest case the Bell inequality is violated optimally, last QIP Miller and Shi showed it worked with some not-quite-perfect value of the CHSH game and the current result extends this to any beyond-classical value of the game.
One the key technical ingredients is a new uncertainty principle.
We will not fully do it justice here. Roughly it is as follows. Define
Y = \text{tr} [\rho_+^{1+\epsilon} + \rho_-^{1-\epsilon}] / \text{tr}[\rho^{1+\epsilon}] and X = \text{tr} [\rho_0^{1+\epsilon}] / \text{tr} [\rho^{1+\epsilon}] where \{\rho_0,\rho_1\} and \{\rho_+,\rho_-\} are the post-measurement states resulting from a pair of anticommuting measurements on \rho. Given this, the Miller-Shi uncertainty principle states that the pair (X,Y) must fit into a particular region of the plane. (see eq (2.2) of their paper for a bit more detail.)
The proof uses the “uniform convexity of the S_{1+\epsilon} norm” due to Ball, Carlen, & Lieb (see also this note). Carl suggests there are more applications. Fernando Brandao and I [Aram] have one! Stay tuned.
At one point Carl says: “By an inductive argument (which actually takes 20 pages), we see that…”
The result can then be generalized beyond X-Z measurements to any pair of non-commuting measurements.
Now what about totally untrusted devices? These too can be forced, using a VV-like approach, to act in a way as though they are performing non-commutative measurements.
This can be generalized even further to Kochen-Specker inequalties and contextuality games, modulo some mild assumptions on how the device works.
open problems: what resources are used? This uses a linear amount of entanglement, for example. For any experimentalists watching (for whom entanglement is scarce and randomness abundant), this part of the talk must sound funny.

Neil J. Ross and Peter Selinger.
Optimal ancilla-free Clifford+T approximation of z-rotations (Plenary Talk)
abstract arXiv:1403.2975

Solovay-Kitaev was all about geometry, but recently the field of quantum compiling has shifted to algebraic number theory, following the path-breaking 2012 paper by Kliuchnikov, Maslov, and Mosca.
Some number theory: In the ring \mathbb{Z}[\sqrt{2}] define (a+b\sqrt{2})^{\bullet} = a-b\sqrt{2} and the norm N(a) := \sqrt{a^\bullet a}.
Applying the bullet operation (a Galois automorphism of the number field) to a compact interval yields a discrete unbounded set. The 1-d grid problem is, given finite intervals A, B \subset \mathbb{R}, find an element of A\cap B^{\bullet}. Generically there will be O(|A|\cdot |B|) solutions, and these are easy to find when both |A| and |B| are large, corresponding to a fat rectangle (when \mathbb{Z}[2] is viewed as a 2-d lattice). When we have a long thin rectangle we can convert it into this case by rescaling.
The 2-d grid problem: Now we consider \mathbb{Z}[\omega] for \omega=\exp(2\pi i /8). This turns into a more complicated set intersection problem on a 2-d lattice whose details we omit. But just like the rescaling used for 1-d, now we use a more complicated set of shearing operators to transform the problem into one where the solutions are easier to find. But we need an extra property: the “uprightness” of the rectangles.
Main Theorem: Let A and B be two convex sets with non-empty interior. There is is a grid operator G such that both G(A) and G^\bullet(B) are both 1/6 upright. Moreover, it can be efficiently computed.
Theorem [Matsumoto-Amano ’08]
Every Clifford + T single-qubit operator W can be uniquely written in the form
W = (T | eps) (HT | SHT)* C,
where C is a Clifford.
Exact synthesis of the Clifford+T operators was solved by Kliuchinikov, Maslov, and Mosca. If You have W a 2-by-2 unitary operator, then it is Clifford+T iff the matrix elements are all of the form \frac{1}{2^{k/2}} \mathbb{Z}[\omega]) where \omega = \sqrt{i}. Moreover, if \det W = 1, then the T-count of the resulting operator is equal to 2k-2.
The upshot of this is that if you have a factoring oracle then the algorithm gives circuits of optimal T-count. In the absence of such an oracle, then this returns a nearly optimal T-count, namely the second-to-optimal T-count m plus a term of order O(\log\log 1/\epsilon).

Adam Bouland and Scott Aaronson.
Generation of Universal Linear Optics by Any Beamsplitter
abstract arXiv:1310.6718

Are there any interesting sets of beamsplitter-type quantum gates that don’t generate either only permutation matrices or arbitrary unitaries? The main result is that if one has a two-level unitary of determinant 1 and with all entries non-zero, then it densely generates SU(m) or SO(m) for m \ge 3. In other words, any beamsplitter that mixes modes is universal for three or more modes. Another way to say this is that for any beamsplitter b, either b is efficiently classically simulable or else it is universal for quantum optics. This is a nice dichotomy theorem. The proof requires the classification of the finite subgroups of SU(3), which was, surprisingly, completed only in 2013. (It was written down in 1917 as “an exercise for the reader” but educational standards have apparently come down since then.)

Isaac Kim.
On the informational completeness of local observables
abstract arXiv:1405.0137

2014 has seen some of the most exciting news for quantum conditional mutual information since strong subadditivity.
The goal of this talk is to try to find a large class of interesting states S such that quantum state tomography and quantum state verification can be done efficiently. We would like to also have that if a state is in S then we can efficiently verify that fact.
At one point in the intro, Isaac says that a point is “…for the experts, err, for the physicists.” Then he remembers who his audience is. Later he says “for this audience, I don’t have to apologize for the term ‘quantum conditional mutual information'”.
This talk proposes a class S_n that has this verification property in time O(n) where n is the number of particles. Any state in S_n is defined by a set of O(1)-particle local reduced density operators.
Isaac first reviews the approach to matrix product state tomography, which uses the parent Hamiltonian of a matrix product state to show that any (injective) MPS is completely determined by a local set of observables. This result is a generalization of the MPS tomography result to higher dimensions, but with the important difference that there is no need for a global wavefunction at all!
The main result, roughly speaking, is that there exists a certificate \epsilon(\{\rho_k \mathcal{N}_k\}) such that
|\rho - \rho'|_1 \le \epsilon(\{\rho_k \mathcal{N}_k\}).
Local density matrices can (approximately) reconstruct the global state. Previously this was done assuming that the global state is a low-bond-dimension MPS. But that’s in a sense circular because we have to make a (strong) assumption about the global state. This work allows us to do this purely using local observables. Awesome.
The talk proceeds by giving a reconstruction procedure by which a global state \rho' can be reconstructed from the marginals of a state \rho, and by showing conditions under which \rho\approx \rho'. The global state is completely determined by the local reduced density operators if the conditional quantum mutual information is zero.

Theorem 1: If \rho_{AB} = \sigma_{AB} and \rho_{BC} = \sigma_{BC} then
\frac{1}{4} \|\rho_{ABC} - \sigma_{ABC}\|_1^2 \leq I(A:C|B)_\rho + I(A:C|B)_\sigma

This is nice, but the upper bound depends on global properties of the state. (Here we think of AB and BC as “local” and ABC as “global.” Don’t worry, this will scale well to n qubits.)
To proceed we would like to upper bound the QCMI (quantum conditional mutual information) in terms of locally observable quantities. This is a straightforward but clever consequence of strong subadditivity. Specifically
I(A:C|B) \leq S(CF) - S(F) + S(BC) - S(B)
for any system F.
For judicious choice of regions, and assuming a strong form of the area law conjecture (S(A) = a |\partial A| - \gamma + o(1) in 2-d), this bound is effective.
We can then build up our reconstruction qubit-by-qubit until we have covered the entire system.

  • quantum state tomography: Measure local observables and check the locally computable upper bound. If the latter is small, then the former contains sufficient information to reconstruct the global wavefunction.
  • quantum state verification: Similar except we want to just check whether our physical state is close to a desired one

Bartek Czech, Patrick Hayden, Nima Lashkari and Brian Swingle.
The information theoretic interpretation of the length of a curve
abstract arXiv:1410.1540

At ultra-high energies, we need a theory of quantum gravity to accurately describe physics. As we move down in energy scales, vanilla quantum field theory and eventually condensed-matter physics become relevant. Quantum information theory comes into play in all three of these arenas!
One of the principal claims of this talk is that quantum information is going to be central to understanding holography, which is the key concept underlying, e.g., the AdS/CFT correspondence. Quantum gravity theories in negatively curved spacetimes (saddle-like geometries) seem to have a duality: one Hilbert space has two theories. The duality dictionary helps us compute quantities of interest in one theory and map the solutions back to the other theory. It turns out that these dictionaries often involve computing entropic quantities.
A funny feature of AdS space is that light can reach infinity and bounce back in finite time! This acts just like boundary conditions (at infinity) and simplifies certain things. Inside the bulk of AdS space, geodesics are (in the Poincare disk model) just circular arcs that meet the boundary at right angles. When there is matter in the bulk, these get deformed. We’d love to be able to interpret these geometric quantities in the bulk back in the CFT on the boundary.
The Ryu-Takayanagi correspondence links the length of a geodesic in the bulk with the entanglement entropy in the CFT on the boundary. This can be related (using e.g. work by Calabrese & Cardy) to the central charge of the CFT. A really neat application of this correspondence is a very simple and completely geometric proof of strong subadditivity due to Headrik and Takayanagi (with the important caveat that it applies only to a restricted class of states, namely those states of CFTs that have gravity duals).
The new result here is to generalize the Ryu-Takayanagi correspondence to general curves, not just geodesics. The main result: there is a correspondence between the length of convex curves in the bulk and the optimal entanglement cost of streaming teleportation communication tasks that Alice and Bob perform on the boundary.

QIP 2015 "live"-blogging, Day 3

We promise we’ll finish posting these soon! Day 3 was only a half day of talks with a free afternoon, and the rainy weather of the first two days finally subsided just in time.

Jean-Pierre Tillich
Decoding Quantum LDPC Codes

LDPC codes are families of sparse codes, meaning that the stabilizers are all low weight, and we usually also require that each qubit partakes in at most a constant number of stabilizers. That is, the parity check matrix is both row and column sparse with constant sparsity per row and column.
The classical versions of these codes are ubiquitous; your cell phone uses LDPC codes, for example. One of the principle advantages of (classical) LDPC codes is that they have good minimum distance, constant rate, and have fast nearly optimal decoders. A new variant called spatially coupled LDPC codes are a universal way (i.e., independent of the channel) to get capacity for most memoryless channels of interest with a low-complexity decoder.
Unfortunately, the quantum case is largely open still, and the speaker admitted in the first moment that he wasn’t going to solve everything during his talk. Leaving aside the question of distance of quantum LDPC codes, one of the reasons why decoding a quantum code is hard is that any error E has the same syndrome as ES if S is in the stabilizer group. This added degeneracy creates an extra freedom that is difficult to account for in a decoder. You want to find the most likely coset, rather than the most likely error.
In a classical LDPC code, the Tanner graph (the bipartite graph whose nodes are bits and checks with edges whenever a collection of bits partake in the same check) is nearly tree-like (it’s an expander) and so message passing algorithms perform nearly at capacity. In the quantum case, the Tanner graph contains many 4-cycles, and message passing decoders get stuck in loops and don’t converge.
Bravyi, Suchara, and Vargo have recently had a breakthrough where they were able to decode a very special quantum LDPC code near the hashing bound: the toric code. But didn’t we already know how to decode the toric code? Yes, of course, there are many good decoders for this code. However, none of them achieve a performance that decodes X and Z errors simultaneously with performance approximately equal to the maximum a posteriori (MAP) decoder.
There are also a new idea, pioneered for polar codes by Renes, Dupuis, and Renner and expanded by Delfosse and Tillich, to use asymmetric codes and decoders. If you are going to decode X first and then Z, why not use a stronger code for X errors, decode them first, and then use the advantage to help you decode the Z after that? Using this idea, you can provably achieve the hashing bound asymptotically.
He concludes with some bright spots: recent global analysis of local decoders (such as the recent work by Hastings on hyperbolic codes), and tricks like entanglement-assisted codes or spatially coupled quantum LDPC codes might lead to capacity achieving decoders soon.

Fernando Pastawski and Beni Yoshida.
Fault-tolerant logical gates in quantum error-correcting codes
abstract arXiv:1408.1720

A great intro: “My paper is at 1408.1720 + PRA…I forgot the full journal reference. You can look it up, which is what people usually do.” Or they just get it from the arxiv!
Fault-tolerant logical gates are central objects in quantum computation. We need them to suppress the spread of errors in computations. Unfortunately, there is a somewhat stringent no-go theorem due to Eastin and Knill that says you cannot have a universal set of transversal gates. (There are various ways around this theorem, such as recent work by Paetznick and Reichard.)
Recently, Bravyi and Koenig have generalized this work. They extend the notion of transversal gate to the idea of a constant-depth circuit and show that constant-depth circuits restrict the allowable gates performable in topological codes to certain levels of the Gottesman-Chuang hierarchy. The main idea of the proof is to use a cleaning lemma. If you can split your code into m+1 correctable regions, then you can only implement gates in level m of the Clifford hierarchy. In particular, D-dimensional topological codes can be split into D+1 correctable regions, so we can only implement gates at level D in the hierarchy.
What Yoshida and Pastawski have done is show that if you have a code with a loss tolerance threshold p_{\mathrm{loss}} > 1/n for some number n, then any transversal gate must be in P_{n-1}. If you have a P_n logic gate, then p_{\mathrm{loss}} \ge 1/n.
Another result is that if a stabilizer Hamiltonian in 3-dimensions has a fault-tolerantly implementable non-Clifford gate, then the energy barrier is constant. The proof is simple: we can split the 3-d chunk of code into correctable regions that have string-like shapes, so there must exist a string-like logical operator for these codes, hence the energy barrier is constant. Beautifully simple!
If you have a topological stabilizer code in D dimensions with an mth level logical gate, then the code distance is at most d\le O(L^{D+1-m}). This improves a bound of Bravyi-Terhal that didn’t take advantage of this m factor.
Lastly, we can consider subsystem codes. In D-dimensions, fault-tolerant gates are in P_D (as long as the code distance grows at least logarithmically in the system size).

For the remaining session, it turns out that no Pontiffs were in the audience. However, we had mostly seen these results in other venues or read them before. They are great results, so go read the papers! We hope the trackback to the arxiv is a small compensation to the authors for our failure to blog these interesting talks.

Marco Piani and John Watrous.
Einstein-Podolsky-Rosen steering provides the advantage in entanglement-assisted subchannel discrimination with one-way measurements
abstract arXiv:1406.0530

Stefan Baeuml, Matthias Christandl, Karol Horodecki and Andreas Winter.
Limitations on Quantum Key Repeaters
abstract arXiv:1402.5927

William Matthews and Debbie Leung.
On the power of PPT-preserving and non-signalling codes
abstract arXiv:1406.7142

QIP 2015 live-blogging, Day 2

From the team that brought you “QIP 2015 Day 1 liveblogging“, here is the exciting sequel. Will they build a quantum computer? Will any complexity classes collapse? Will any results depend on the validity of the Extended Riemann Hypothesis? Read on and find out!
Praise from a reader of “day 1”:
QIP 2015 liveblogging — it’s almost like being there. Maybe better.

David J. Wineland abstract
Quantum state manipulation of trapped ions

Rather than “bore us” (his words) with experimental details, Dave gave a broad-brush picture of some of the progress that his lab has made over the years at improving the coherence of quantum systems.
Dave gave a history of NIST looking for more accurate clocks. Recently, a trapped near-UV transition of Hg ions at last did better than the continually improving microwave Cs standard.
At a 1994 conference at NIST, they invited Artur Ekert to speak about quantum gates. Cirac and Zoller gave the first detailed proposal for quantum computing with a linear ion trap at about this time. They were quickly able to demonstrate one of these gates in a linear ion trap.
He showed and discussed a picture of the racetrack planar ion-trap array, where ions are moved into position to perform gates throughout the trap. They can move an manipulate the ions using a scheme due to Milburn, Schneider, James, Sorenson, and Molmer that uses position dependent dipole forces. The transverse Ising model can be simulated by applying a moving standing wave to ions in a linear trap; this is a test case for useful simulations.
Other groups at NIST have also done impressive work on quantum simulation. Bollinger’s group has made a self-assembled triangular lattice with Ising-type couplings that we talked about previously here on the Pontiff.
Everyone in the ion trap business is plagued by something called “anomalous heating”, of unknown origin, which gets worse as the length scale gets smaller. Colleagues studying surface science have suggested using an argon ion cannon (damn, that sounds impressive) to blast away impurities in the surface trap electrodes, scrubbing the surface clean. This has reduced anomalous heating 100 fold, but it’s still above all known electronic causes. Using cryogenic cooling helps too, as has been done by Ike Chuang’s group at MIT.
Laser intensity fluctuations at the site of the ions is another continual source of error. Optical and IR beams can be efficiently transmitted and positioned by optical fibers, but UV beams create color centers and degrade optical fiber on a timescale of about an hour. Recent work by the group has shown that this degradation timescale can be extended somewhat.
Dave showed a list, and there are about 30+ groups around the world working on ion-trap quantum information processing. Pretty impressive!
Dave showed this Time magazine cover that calls D-Wave the “Infinity Machine” that no one understands. In contrast, he says, we know how quantum computing works… and how it doesn’t. Sober experimentalists seem to be in rough agreement that

  • A factoring machine is decades away.
  • Quantum simulation may be possible within the next decade.
  • The real excitement will be a simulation that tells us something new about physics.

Joel Wallman and Steve Flammia
Randomized Benchmarking with Confidence
abstract arXiv:1404.6025

Randomized benchmarking is a standard method whereby experimental implementations of quantum gates can be assessed for their average-case accuracy in a way that doesn’t conflate the noise on the gates with the noise of state preparation and measurement (SPAM) errors.
The protocol is simple:

  • Choose a random sequence of m Clifford gates
  • prepare the initial state in computational basis
  • Apply the Clifford gate sequence and then the inverse gate at the end
  • Measure in computational basis.

Repeat this for many random sequences and many repetitions of the each sequence to get statistics. Under a certain noise model called the “0th order model”, the averages of this procedure for different values of m will fit to a model of the form F_m = A + B f^m where f is a quantity closely related to the average quality of the gates in the sequence. Define r to be the average error rate. (Morally, this is equivalent to “1-f”, in the above model, but the actual formula is more complicated.) To understand the convergence of this protocol to an estimate, we need to understand the variance as a function of m,r.
The main contribution is to reduce the variance bound from the trivial bound of O(1) to O(mr). This provides a good guide on how to choose optimal lengths m for experiments, and the bounds are nearly exact in the case of a single qubit. In the parameter range of interest, this improved over previous estimates of the sample complexity by three orders of magnitude.

Fernando Brandao, Marcus Cramer and Madalin Guta
A Berry-Esseen Theorem for Quantum Lattice Systems and the Equivalence of Statistical Mechanical Ensembles

The full version is not yet on the arxiv, but due to an author mistake, the above link gives the long version of QIP submission. Download it there while you still can!
Quantum many-body systems are pretty wild objects, with states in 2^{10^{23}} dimensions or even worse. But we often have a mental models of them as basically like non-interacting spins. In some cases, the renormalization group and other arguments can partially justify this. One thing that’s true in the case of non-interacting spins is that the density of states is approximately Gaussian. The idea here is to show that this still holds when we replace “non-interacting spins” with something morally similar, such as exponentially decaying correlations, bounded-range interactions, etc.
This way of writing it makes it sound trivial. But major open questions like the area law fit into this framework, and proving most statements is difficult. So technical advances in validating our “finite correlation length looks like non-interacting spins” intuition can be valuable.
Today’s technical advance is a quantum version of the Berry-Esseen theorem. The usual Berry-Esseen theorem gives quantitative bounds on the convergence to the mean that we get from the central limit theorem. Here we consider a lattice version, where we consider spins on a d-dimensional lattice and local observables A and B that act on subsets of spins separated by a distance L. We require a finite correlation length, as we get for example, for all Gibbs states above some critical temperature (or at any nonzero temperature in D=1).
What does a (quantitative) CLT give us beyond mere large deviation bounds? It shows that the density of states (at least those inhabited by the particular state \rho) is roughly Gaussian thereby roughly matching what we would get from a tensor power state. This is somewhat stronger than the “typical subspace”-type guarantees that we would get from a large deviation bounds.
The main application here is an equivalence theorem between the canonical and microcanonical ensembles: i.e. between the Gibbs state and a uniform mixture over an energy band of width O(\sqrt N). These states are far apart in trace distance, but this paper shows that they look similar with respect to sufficiently local observables. If you think this sounds easy, well, then try to prove it yourself, and then once you give up, read this paper.

Michael Kastoryano and Fernando Brandao
Quantum Gibbs Samplers: the commuting case
abstract arXiv:1409.3435

How efficiently can we prepare thermal states on a quantum computer? There is a related question: how does nature prepare states? That is, what is the natural rate for thermalization given a quantum lattice system? There are two possible ways to model thermalization, both of which are computationally efficient. “Davies generators” mean local jumps that can be modeled as local interactions with a Markovian bath at a fixed temperature, while “heat-bath generators” mean that we repeatedly apply the Petz recovery map to small blocks of spins. Call both “Gibbs samplers.”
Consider the setting where you have a system living on a lattice with a bit or qubit on each site, and some memoryless, spatially local, dynamics. Classically the powerful tools of DLR (Dobrushin-Lanford-Ruelle) theory imply a close relation between properties of the dynamics and properties of the stationary state. Specifically, spatial mixing (meaning decaying correlations in the stationary state) can be related to temporal mixing (meaning that the dynamics converge rapidly to the stationary state). (The best reference I know is Martinelli, but for a more CS-friendly version, see also this paper.)
An exact quantum analogy to this cannot be reasonably defined, since the classical definition involves conditioning – which often is the reason classical information theory ideas fail to translate into the quantum case.
One of the first contributions of this work then is to define quantum notions of “weak clustering” (more or less the familiar exponential decay of correlations between well-separated observables) and “strong clustering” (a more complicated definition involving overlapping regions). Then the main result is that there is an intimate connection between the rate of convergence of any quantum algorithm for reaching the Gibbs state and the correlations in the Gibbs state itself. Namely: strong clustering (but not weak clustering) is equivalent to rapid mixing of the Gibbs sampler. Everything here assumes commuting Hamiltonians, by the way. Also, “rapid mixing” is equivalent to the Gibbs sampler being gapped (think of this like the quantum version of being a gapped Markov chain).
One direction is fairly straightforward. To show that strong clustering implies a gapped Gibbs sampler, we directly apply the variational characterization of the gap. (The dynamics of a continuous-time Gibbs sampler can be written as \dot\rho = -\mathcal{A}[\rho] for some linear superoperator \mathcal{A}, which we will assume to be Hermitian for convenience. \mathcal{A} has all nonnegative eigenvalues because it is stable, and it has a single eigenvalue equal to 0, corresponding to the unique stationary distribution. The gap is given by the smallest positive eigenvalue, and this “smallest” is what gives rise to the variational characterization. See their paper for details.) The variational calculation involves a minimization over (global) states and strong clustering lets us reduce this to calculations involving local states that are much easier to bound.
In the other direction (gap implies strong clustering), we relate the Gibbs sampler to a local Hamiltonian, and use the detectability lemma, which in fact was originally used in part to prove a statement about decay of correlations. The idea is to construct an AGSP (approximate ground-state projector) which is a low-degree polynomial of the Hamiltonian. Because it’s low degree, applying it does not increase the entanglement across any cut by much (useful for proving area laws) or does not propagate correlations far (along the lines of Lieb-Robinson; useful for correlation decay).
When can these results be applied? In 1-D, strong and weak clustering are equivalent (because boundary terms can be removed), and therefore both are implied by (Hamiltonian) gap. Also in any number of spatial dimensions, above a universal critical temperature the Gibbs samplers are always gapped.
Some open questions:

  • If in 2-D, one could also show strong=weak clustering (as is known classically in <3 dimensions), it would nail the coffin of 2d quantum memory for any commuting Hamiltonian.
  • Classically, there is a dichotomy result: either there is very rapid mixing (log(N) time) or very slow (exp(N)) time. Here they can only get poly(N) mixing. Can these results be extended to the log-Sobolev type bounds that give this type of result?

Mehmet Burak Şahinoğlu, Dominic Williamson, Nick Bultinck, Michael Marien, Jutho Haegeman, Norbert Schuch and Frank Verstraete
Characterizing Topological Order with Matrix Product Operators
Oliver Buerschaper
Matrix Product Operators: Local Equivalence and Topological Order
abstract-137 arXiv:1409.2150 abstract-176

Characterizing topological quantum order is a challenging problem in many-body physics. In two dimensions, it is generally accepted that all topologically ordered ground states are described (in a long-range limit) by a theory of anyons. These anyonic theories have characteristic features like topology-dependent degeneracy and local indistinguishability in the ground space and string-like operators that map between these ground states.
The most famous example of this is Kitaev’s toric code, and we are interested in it at a quantum information conference because of its ability to act as a natural quantum error-correcting code. The four ground states of the toric code can be considered as a loop gas, where each ground state is a uniform superposition of all loops on the torus satisfying a given parity constraint.
The goal in this talk is to classify types of topological order using the formalism of matrix product states, and their slightly more general cousins, matrix product operators (MPO). The authors define an algebra for MPOs that mimics the algebra of loop operators in a topologically ordered material. Because matrix product operators have efficient descriptions classically, they are well suited to numerical studies, and their structure also allows them to be used for analytical investigations.
The main idea that the authors introduce is a condition on MPO operators so that they behave like topological operators. In particular, they obey a “deformation” condition that lets them be pushed around the lattice, just like Wilson loops.
The authors used this idea to study models that are not stabilizer codes, such as the double semion model and more generally the class of string-net models. This looks like a very promising tool for studying topological order.

Dorit Aharonov, Aram Harrow, Zeph Landau, Daniel Nagaj, Mario Szegedy and Umesh Vazirani
Local tests of global entanglement and a counterexample to the generalized area law
abstract 1410.0951

Steve: “Counterexamples to the generalized area law” is an implicit admission that they just disproved something that nobody was conjecturing in the first place. 😉
Aram: I’ll blog about this later.

Xiaotong Ni, Oliver Buerschaper and Maarten Van Den Nest
A non-commuting Stabilizer Formalism
abstract arXiv:1404.5327

This paper introduces a new formalism called the “XS stabilizer” formalism that allows you to describe states in an analogous way to the standard stabilizer formalism, but where the matrices in the group don’t commute. The collection of matrices is generated by X, S, \alpha, where \alpha = \sqrt{i} and S = \sqrt{Z} on n qubits. A state or subspace that is stabilized by a subgroup of these operators is said to be an XS stabilizer state or code. Although these are, as Xiaotong says, “innocent-looking tensor product operators”, the stabilizer states and codes can be very highly entangled.
One of the applications of this formalism is to classify the double semion model, which is a local Hamiltonian model with topological order. There are sets of general conditions for when such states and codes can be ground states of local commuting XS Hamiltonians. Unfortunately, not all of these properties can be computed efficiently; some of these properties are NP-complete to compute. There are some interesting open questions here, for example what class of commuting projector Hamiltonians ground states are in NP?

Dave Touchette
Direct Sum Theorem for Bounded Round Quantum Communication Complexity and a New, Fully Quantum Notion of Information Complexity (Recipient of the QIP2015 Best Student Paper Prize)
abstract arXiv:1409.4391

“Information complexity” is a variant of communication complexity that measures not the number of bits exchanged in a protocol but the amount of “information”, however that is defined. Here is a series of tutorials for the classical case. Entropy is one possibility, since this would put an upper bound on the asymptotic compressibility of many parallel repetitions of a protocol. But in general this gives up too much. If Alice holds random variables AC, Bob holds random variable B and Alice wants to send C to Bob then the cost of this is (asymptotically) I(A:C|B).
This claim has a number of qualifications. It is asymptotic and approximate, meaning that it holds in the limit of many copies. However, see 1410.3031 for a one-shot version. And when communication is measured in qubits, the amount is actually \frac{1}{2} I(A:C|B).
Defining this correctly for multiple messages is tricky. In the classical case, there is a well-defined “transcript” (call it T) of all the messages, and we can define information cost as I(X:T|Y) + I(Y:T|X), where X,Y are the inputs for Alice and Bob respectively. In the quantum case we realize that the very idea of a transcript implicitly uses the principle that (classical) information can be freely copied, and so for quantum protocols we cannot use it. Instead Dave just sums the QCMI (quantum conditional mutual information) of each step of the protocol. This means I(A:M|B) when Alice sends M to Bob and I(B:M|A) when Bob sends A to Alice. Here A,B refer to the entire systems of Alice/Bob respectively. (Earlier work by Yao and Cleve-Buhrman approached this in other, less ideal, ways.)
When minimized over all valid protocols, Dave’s version of Quantum Information Complexity represents exactly the amortized quantum communication complexity. This sounds awesome, but there are a bunch of asterisks. First “minimized over all valid protocols,” is an unbounded minimization (and some of these protocols really do use an infinite number of rounds), although it is in a sense “single-shot” in that it’s considering only protocols for calculating the function once. Also “amortized” here is not quite the same as in Shannon theory. When we talk about the capacity of a channel or its simulation cost (as in these sense of reverse Shannon theorems) we usually demand that the block error rate approach zero. In this case, the information complexity is defined in terms of an error parameter \epsilon (i.e. it is the minimum sum of QCMI’s over all protocols that compute the function up to error \epsilon). This then corresponds to the asymptotic cost of simulating a large number of evaluations of the function, each of which is allowed to err with probability \epsilon. The analogue in Shannon theory is something called rate-distortion theory.
Before you turn up your nose, though, the current talk gets rid of this amortized restriction. QIC (quantum information complexity) is easily seen to be a lower bound for the communication complexity and this work shows that it is also an upper bound. At least up to a multiplicative factor of 1/\epsilon^2 and an additive term that also scales with the number of rounds. Since QIC is also a lower bound for the above amortized version of complexity, this proves a direct sum theorem, meaning that computing n function values costs \Omega(n) as much as one function evaluation. Here the weak amortized definition actually makes the result stronger, since we are proving lower bounds on the communication cost. In other words, the lower bound also applies to the case of low block-wise error.
The technical tools are the one-shot redistribution protocol mentioned above (see also this version) and the Jain-Radhakrishnan-Sen substate theorem (recently reproved in 1103.6067 and the subject of a press release that I suppose justifies calling this a “celebrated” theorem). I should write a blog post about how much I hate it when people refer to “celebrated” theorems. Personally I celebrate things like Thanksgiving and New Year’s, not the PCP theorem. But I digress.

Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Perez-Garcia and Sergii Strelchuk
Unbounded number of channel uses are required to see quantum capacity
abstract arXiv:1408.5115

Is the quantum capacity of a quantum channel our field’s version of string theory? Along the lines of this great Peter Shor book review, quantum Shannon theory has yielded some delightful pleasant surprises, but our attempts to prove an analogue of Shannon’s famous formula C=\max_p I(A:B) has turned into a quagmire that has now lasted longer than the Vietnam War.
Today’s talk is the latest grim news on this front. Yes, we have a capacity theorem for the (unassisted) quantum capacity, the famous LSD theorem, but it requires “regularization” meaning maximizing a rescaled entropic quantity over an unbounded number of channel uses. Of course the definition of the capacity itself involves a maximization over an unbounded number of channel uses, so formally speaking we are not better off, although in practice the capacity formula can often give decent lower bounds. On the other hand, we still don’t know if it is even decidable.
Specifically the capacity formula is
\displaystyle Q = \lim_{n\rightarrow\infty} Q^{(n)} := \lim_{n\rightarrow\infty} \frac{1}{n} \max_\rho I_c(\mathcal{N}^{\otimes n}, \rho),
where \rho is maximized over all inputs to n uses of the channel and I_c is the coherent information (see paper for def). In evaluating this formula, how large do we have to take n? e.g. could prove that we always have Q^{(n)} \geq (1-1/n)Q? If this, or some formula like it, were true then we would get an explicit upper bound on the complexity of estimating capacity.
The main result here is to give us bad news on this front, in fairly strong terms. For any n they define a channel for which Q^{(n)}=0 but Q>0.
Thus we need an unbounded number of channel uses to detect whether the quantum capacity (ie the regularized coherent information) is even zero or nonzero.
The talk reviews other non-additivity examples, including classical, private, zero-error quantum and classical capacities. Are there any good review articles here?
Here’s how the construction works. It builds on the Smith-Yard superactivation result, which combines an erasure channel (whose lack of capacity follows from the no-cloning theorem) and a PPT channel (whose lack of capacity follows from being PPT). The PPT channel is chosen to be able to send private information (we know these exist from a paper by H3O) and by using the structure of these states (further developed in later three-Horodecki-and-an-Oppenheim work), one can show that combining this with an erasure channel can send some quantum information. Specifically the PPT channel produces a “shield” which, if faithfully transmitted to Bob, enables perfect quantum communication.
This new construction is similar but uses a shield with many parts any one of which can be used to extract a valid quantum state. On the other hand, the erasure probability is increased nearly to one, and noise is added as well. Proving this is pretty tough and involves sending many things to zero or infinity at varying rates.
During question period prolonged jocular discussion triggered by John Smolin saying title was inappropriate, since the authors had clearly shown that by examining the parameters of the channel the quantum capacity was positive, so detecting positivity of capacity required no channel uses.
D. Gottesman suggested a more operational interpretation of title, given a black box, how many uses of it are needed to decide whether its quantum capacity was positive. If it was, e.g. an erasure channel with erasure probability very close to 1/2, arbitrarily many uses would be needed to confidently decide. It’s not clear how to formalize this model.
By the way, better Shannon theory news is coming in a few days for bosonic channels with the talk by Andrea Mari.

QIP 2015 Return of the Live-blogging, Day 1

Jan 14 update at the end.

The three Pontiffs are reunited at QIP 2015 and, having forgotten how painful liveblogging was in the past, are doing it again. This time we will aim for some slightly more selective comments.

In an ideal world the QIP PC would have written these sorts of summaries and posted them on scirate, but instead they are posted on easychair where most of you can’t access them. Sorry about this! We will argue at the business meeting for a more open refereeing process.

The first plenary talk was:

Ran Raz (Weizmann Institute)
How to Delegate Computations: The Power of No-Signaling Proofs

Why is the set of no-signalling distributions worth looking at? (That is, the set of conditional probability distributions p(a,b|x,y) that have well-defined marginals p(a|x) and p(b|y).) One way to think about it is as a relaxation of the set of “quantum” distributions, meaning the input-output distributions that are compatible with entangled states. The no-signalling polytope is defined by a polynomial number of linear constraints, and so is the sort of relaxation that is amenable to linear programming, whereas we don’t even know whether the quantum value of a game is computable. But is the no-signalling condition ever interesting in itself?

Raz and his coauthors (Yael Kalai and Ron Rothblum) prove a major result (which we’ll get to below) about the computational power of multi-prover proof systems where the provers have access to arbitrary non-signalling distributions. But they began by trying to prove an apparently unrelated classical crypto result. In general, multiple provers are stronger than one prover. Classically we have MIP=NEXP and IP=PSPACE, and in fact that MIP protocol just requires one round, whereas k rounds with a single prover is (roughly) within the k’th level of the polynomial hierarchy (i.e. even below PSPACE). So simulating many provers with one prover seems in general crazy.

But suppose instead the provers are computationally limited. Suppose they are strong enough for the problem to be interesting (i.e. they are much stronger than the verifier, so it is worthwhile for the verifier to delegate some nontrivial computation to them) but to weak to break some FHE (fully homomorphic encryption) scheme. This requires computational assumptions, but nothing too outlandish. Then the situation might be very different. If the verifier sends its queries using FHE, then one prover might simulate many provers without compromising security. This was the intuition of a paper from 2000, which Raz and coauthors finally are able to prove. The catch is that even though the single prover can’t break the FHE, it can let its simulated provers play according to a no-signalling distribution. (Or at least this possibility cannot be ruled out.) So proving the security of 1-prover delegated computation requires not only the computational assumptions used for FHE, but also a multi-prover proof system that is secure against no-signalling distributions.

Via this route, Raz and coauthors found themselves in QIP territory. When they started it was known that

This work nails down the complexity of the many-prover setting, showing that EXP is contained in MIPns[poly provers], so that in fact that classes are equal.

It is a nice open question whether the same is true for a constant number of provers, say 3. By comparison, three entangled provers or two classical provers are strong enough to contain NEXP.

One beautiful consequence is that optimizing a linear function over the no-signalling polytope is roughly a P-complete problem. Previously it was known that linear programming was P-complete, meaning that it was unlikely to be solvable in, say, log space. But this work shows that this is true even if the constraints are fixed once and for all, and only the objective function is varied. (And we allow error.) This is established in a recent followup paper [ECCC TR14-170] by two of the same authors.

Francois Le Gall.
Improved Quantum Algorithm for Triangle Finding via Combinatorial Arguments
abstract arXiv:1407.0085

A technical tour-de-force that we will not do justice to here. One intriguing barrier-breaking aspect of the work is that all previous algorithms for triangle finding worked equally well for the standard unweighted case as well as a weighted variant in which each edge is labeled by a number and the goal is to find a set of edges (a,b), (b,c), (c,a) whose weights add up to a particular target. Indeed this algorithm has a query complexity for the unweighted case that is known to be impossible for the weighted version. A related point is that this shows the limitations of the otherwise versatile non-adaptive learning-graph method.

Ryan O’Donnell and John Wright
Quantum Spectrum Testing
abstract arXiv:1501.05028

A classic problem: given \rho^{\otimes n} for \rho an unknown d-dimensional state, estimate some property of \rho. One problem where the answer is still shockingly unknown is to estimate \hat\rho in a way that achieves \mathbb{E} \|\rho-\hat \rho\|_1 \leq\epsilon.
Results from compressed sensing show that n = \tilde\Theta(d^2r^2) for single-copy two-outcome measurements of rank-r states with constant error, but if we allow block measurements then maybe we can do better. Perhaps O(d^2/\epsilon) is possible using using the Local Asymptotic Normality results of Guta and Kahn [0804.3876], as Hayashi has told me, but the details are – if we are feeling generous – still implicit. I hope that he, or somebody, works them out. (18 Jan update: thanks Ashley for fixing a bug in an earlier version of this.)

The current talk focuses instead on properties of the spectrum, e.g. how many copies are needed to distinguish a maximally mixed state of rank r from one of rank r+c? The symmetry of the problem (invariant under both permutations and rotations of the form U^{\otimes n}) means that we can WLOG consider “weak Schur sampling” meaning that we measure which S_n \times U_d irrep our state lies in, and output some function of this result. This irrep is described by an integer partition which, when normalized, is a sort of mangled estimate of the spectrum. It remains only to analyze the accuracy of this estimator in various ways. In many of the interesting cases we can say something nontrivial even if n= o(d^2). This involves some delicate calculations using a lot of symmetric polynomials. Some of these first steps (including many of the canonical ones worked out much earlier by people like Werner) are in my paper quant-ph/0609110 with Childs and Wocjan. But the current work goes far far beyond our old paper and introduces many new tools.

Han-Hsuan Lin and Cedric Yen-Yu Lin. Upper bounds on quantum query complexity inspired by the Elitzur-Vaidman bomb tester
abstract arXiv:1410.0932

This talk considers a new model of query complexity inspired by the Elitzur-Vaidman bomb tester. The bomb tester is a classic demonstration of quantum weirdness: You have a collection of bombs that have a detonation device so sensitive that even a single photon impacting it will set it off. Some of these bombs are live and some are duds, and you’d like to know which is which. Classically, you don’t stand a chance, but quantum mechanically, you can put a photon into a beamsplitter and place the bomb in one arm of a Mach-Zender interferometer. A dud will destroy the interference effects, and a homodyne detector will always click the same way. But you have a 50/50 chance of detecting a live bomb if the other detector clicks! There are various tricks that you can play related to the quantum Zeno effect that let you do much better than this 50% success probability.

The authors define a model of query complexity where one risks explosion for some events, and they showed that the quantum query complexity is related to the bomb query complexity by B(f) = \Theta(Q(f)^2). There were several other interesting results in this talk, but we ran out of steam as it was the last talk before lunch.

Kirsten Eisentraeger, Sean Hallgren, Alexei Kitaev and Fang Song
A quantum algorithm for computing the unit group of an arbitrary degree number field
STOC 2014

One unfortunate weakness of this work: The authors, although apparently knowledgeable about Galois theory, don’t seem to know about this link.

The unit group is a fundamental object in algebraic number theory. It comes up frequently in applications as well, and is used for fully homomorphic encryption, code obfuscation, and many other things.

My [Steve] personal way of understanding the unit group of a number field is that it is a sort of gauge group with respect to the factoring problem. The units in a ring are those numbers with multiplicative inverses. In the ring of integers, where the units are just \pm1 , we can factor composite numbers into 6 = 3 \times 2 = (-3)\times (-2). Both of these are equally valid factorizations; they are equivalent modulo units. In more complicated settings where unique factorization fails, we have factorization into prime ideals, and the group of units can in general become infinite (though always discrete).

The main result of this talk is a quantum algorithm for finding the unit group of a number field of arbitrary degree. One of the technical problems that they had to solve to get this result was to solve the hidden subgroup problem on a continuous group, namely \mathbb{R}^n.

The speaker also announced some work in progress: a quantum algorithm for the principal ideal problem and the class group problem in arbitrary degree number fields [Biasse Song ‘14]. It sounds like not all the details of this are finished yet.

Dominic Berry, Andrew Childs and Robin Kothari
Hamiltonian simulation with nearly optimal dependence on all parameters
abstract 1501.01715

Hamiltonian simulation is not only the original killer app of quantum computers, but also a key subroutine in a large and growing number of problems. I remember thinking it was pretty slick that higher-order Trotter-Suzuki could achieve a run-time of \|H\|t\text{poly}(s)(\|H\|t/\epsilon)^{o(1)} where t is the time we simulate the Hamiltonian for and s is the sparsity. I also remember believing that the known optimality thoerems for Trotter-Suzuki (sorry I can’t find the reference, but it involves decomposing e^{t(A+B)} for the free Lie algebra generated by A,B) meant that this was essentially optimal.

Fortunately, Berry, Childs and Kothari (and in other work, Cleve) weren’t so pessimistic, and have blasted past this implicit barrier. This work synthesizes everything that comes before to achieve a run-time of \tau \text{poly}\log(\tau/\epsilon) where \tau = \|H\|_{\max}st (where \|H\|_{\max} is \max_{i,j} |H_{i,j}| can be related to the earlier bounds via \|H\| \leq d \|H\|_{\max}).

One quote I liked: “but this is just a generating function for Bessel functions!” Miraculously, Dominic makes that sound encouraging. The lesson I suppose is to find an important problem (like Hamiltonian simulation) and to approach it with courage.

Salman Beigi and Amin Gohari
Wiring of No-Signaling Boxes Expands the Hypercontractivity Ribbon
abstract arXiv:1409.3665

If you have some salt water with salt concentration 0.1% and some more with concentration 0.2%, then anything in the range [0.1, 0.2] is possible, but no amount of mixing will give you even a single drop with concentration 0.05% or 0.3%, even if you start with oceans at the initial concentrations. Similarly if Alice and Bob share an unlimited number of locally unbiased random bits with correlation \eta they cannot produce even a single bit with correlation \eta' > \eta if they don’t communicate. This was famously proved by Reingold, Vadhan and Wigderson.

This talk does the same thing for no-signaling boxes. Let’s just think about noisy PR boxes to make this concrete. The exciting thing about this work is that it doesn’t just prove a no-distillation theorem but it defines an innovative new framework for doing so. The desired result feels like something from information theory, in that there is a monotonicity argument, but it needs to use quantities that do not increase with tensor product.

Here is one such quantity. Define the classical correlation measure \rho(A,B) = \max \text{Cov}(f,g) where f:A\mapsto \mathbb{R}, g:B\mapsto \mathbb{R} and each have variance 1. Properties:

  • 0 \leq \rho(A,B) \leq 1
  • \rho(A,B) =0 iff p_{AB} = p_A \cdot p_B
  • \rho(A^n, B^n) = \rho(A,B)
  • for any no-signaling box, \rho(A,B) \leq \max(\rho(A,B|X,Y), \rho(X,Y))

Together this shows that any wiring of boxes cannot increase this quantity.

The proof of this involves a more sophisticated correlation measure that is not just a single number but is a region called the hypercontractivity ribbon (originally due to [Ahlswede, Gacs ‘76]). This is defined to be the set of (\lambda_1, \lambda_2) such that for any f,g we have
\mathbb{E}[f_A g_B] \leq \|f_A\|_{\frac{1}{\lambda_1}} \|g_B\|_{\frac{1}{\lambda_2}}
A remarkable result of [Nair ‘14] is that this is equivalent to the condition that
I(U;AB) \geq \lambda_1 I(U:A) + \lambda_2 I(U:B)
for any extension of the distribution on AB to one on ABU.

Some properties.

  • The ribbon is [0,1]\times [0,1] iff A,B are independent.
  • It is stable under tensor power.
  • monotonicity: local operations on A,B enlarge R

For boxes define R(A,B|X,Y) = \cap_{x,y} R(A,B|x,y). The main theorem is then that rewiring never shrinks hypercontractivity ribbon. And as a result, PR box noise cannot be reduced.

These techniques are beautiful and seem as though they should have further application.

Masahito Hayashi
Estimation of group action with energy constraint
abstract arXiv:1209.3463

Your humble bloggers were at this point also facing an energy constraint which limited our ability to estimate what happened. The setting is that you pick a state, nature applies a unitary (specifically from a group representation) and then you pick a measurement and try to minimize the expected error in estimating the group element corresponding to what nature did. The upshot is that entanglement seems to give a quadratic improvement in metrology. Noise (generally) destroys this. This talk showed that a natural energy constraint on the input also destroys this. One interesting question from Andreas Winter was about what happens when energy constraints are applied also to the measurement, along the lines of 1211.2101 by Navascues and Popescu.

Jan 14 update: forgot one! Sorry Ashley.

Ashley Montanaro
Quantum pattern matching fast on average

Continuing the theme of producing shocking and sometimes superpolynomial speedups to average-case problems, Ashley shows that finding a random pattern of length m in a random text of length n can be done in quantum time \tilde O(\sqrt{n/m}\exp(\sqrt{\log m})). Here “random” means something subtle. The text is uniformly random and the pattern is either uniformly random (in the “no” case) or is a random substring of the text (in the “yes” case). There is also a higher-dimensional generalization of the result.

One exciting thing about this is that it is a fairly natural application of Kuperberg’s algorithm for the dihedral-group HSP; in fact the first such application, although Kuperberg’s original paper does mention a much less natural such variant. (correction: not really the first – see Andrew’s comment below.)
It is interesting to think about this result in the context of the general question about quantum speedups for promise problems. It has long been known that query complexity cannot be improved by more than a polynomial (perhaps quadratic) factor for total functions. The dramatic speedups for things like the HSP, welded trees and even more contrived problems must then use the fact that they work for partial functions, and indeed even “structured” functions. Pattern matching is of course a total function, but not one that will ever be hard on average over a distribution with, say, i.i.d. inputs. Unless the pattern is somehow planted in the text, most distributions simply fail to match with overwhelming probability. It is funny that for i.i.d. bit strings this stops being true when m = O(\log n), which is almost exactly when Ashley’s speedup becomes merely quadratic. So pattern matching is a total function whose hard distributions all look “partial” in some way, at least when quantum speedups are possible. This is somewhat vague, and it may be that some paper out there expresses the idea more clearly.
Part of the strength of this paper is then finding a problem where the promise is so natural. It gives me new hope for the future relevance of things like the HSP.

QIP 2012 Day 5

The quantum pontiff brains have reached saturation.

Eric Chitambar, Wei Cui and Hoi-Kwong Lo:
Increasing Entanglement by Separable Operations and New Monotones for W-type Entanglement

These results demonstrate a large quantitative gap between LOCC and SEP for a particular task called random EPR distillation. Therefore, SEP may not always be a good approximation for LOCC. They also demonstrate entanglement monotones that can increase under SEP, the first known examples with analytic functions. They also show that LOCC is not a closed set of operations, so optimal LOCC protocols may not exist.
Recall how LOCC works: Alice and Bob share a bipartite pure state |psirangle_{AB}. Alice makes a measurement on her system and sends some classical bits to Bob. Bob then makes a measurement on his system and sends some bits to Alice. They repeat this many times. Any LOCC operation is a collection of maps {mathcal{E}_j} such that the sum of the maps is trace preserving and each map has a separable Kraus decomposiiton where each operator can be build from successive rounds of local measurements. This is a pretty difficult condition to study, so it is convenient to relax LOCC to SEP. For SEP, we drop the “successive rounds of local measurements” requirement on the Kraus operators. Given an arbitrary SEP, we can always implement it with LOCC if we allow for some probability of failure, i.e. SLOCC. Can we eliminate this probability of failure? Not in general. There are maps that are in SEP but not in LOCC, as first demonstrated by [Bennett et al.] (“Quantum nonlocality without entanglement”), and subsequently investigated by many other authors.
So now we know that SEP and LOCC are not equal, but how non-equal are they? That is, can we quantify the gap between the two classes? There is some previous work, such as Bennett et al., who showed that the mutual information for state discrimination had a gap of at least 10^{-6}, and work by Koashi et al. who showed that there is a gap in the success probability for unambiguously distinguishing states was at least 0.8%. These gaps look rather small, so you might plausibly conjecture that SEP is a good approximation for LOCC in general.
SEP is precisely the class of operations that cannot create entanglement out of nothing, but if we seed things with a little bit of entanglement, can we use SEP to increase the entanglement? We need to define some LOCC monotone to make this a meaning statement.
Surprisingly, yes! There are entanglement transforms that work in SEP but not LOCC, and therefore entanglement monotones (but artificial ones) that can increase under SEP but not LOCC. To give some intuition, though, here is a non-artificial task.
Random-party distillation of W-class states:
An N-partite W-class state looks (up to local unitary rotation) like
|vec{x}rangle = sqrt{x_0} |00ldots 00rangle + sqrt{x_1} |10ldots 00rangle + sqrt{x_2} |01ldots 00rangle + sqrt{x_n} |00ldots 01rangle.
It’s a good class of states to study because they are only parameterized by N real numbers (as opposed to 2N), it is easy to characterize how the states transform under local measurements, and it is closed under SLOCC. See this paper by Kintas and Turgut for a review of the properties of W-class states.
Here is a nice example, due to Fortescue and Lo. Start with a tripartite W state, and have each party perform the measurement {M_0, M_1}, with M_0 = sqrt{1-epsilon}|0ranglelangle 0|+|1ranglelangle 1| and M_1 = sqrt{epsilon}|0ranglelangle 0|, then broadcast the result. If the outcomes are 0,0,0, then nothing happens. If the outcomes are 1,0,0 or 0,1,0 or 0,0,1, then the two parties measuring 0 are left with an EPR pair; hence this achieves random-party EPR distillation. If there are two or three M_1 outcomes, then the entanglement is lost. However, as epsilonrightarrow 0, then the probability of this happening goes to zero, while the number of rounds go up. Intriguingly, this is evidence that the set of LOCC operations is not closed (i.e. does not contain all of its limit points), but previously that was not proven. Of course, we can also generalize this to the N-partite setting.
This can be generalized by using local filtering to first (probabilistically) map a W-class state to one with x_1=cdots=x_N. In fact, the resulting probability of success is optimal (see paper), and thus this optimal probability of EPR distillation is an entanglement monotone.
The fact that they establish an entanglement monotone means they get an awesome corollary. There is a parameter kappa which represents the success probability of distillation of an EPR pair involving alice. They give an explicit formula for it, and prove that it decreases for any measurement made by Alice that changes the state. Thus, they prove that:

LOCC is not closed!

Here is a great open question (not in the talk): Find a similar monotone that describes data hiding, for example to improve the analysis of these states.
For the multipartite setting, there are a few more ideas. There is a single-copy “combing transformation” (analogous to the one of Yang & Eisert), which transforms all the entanglement to bipartite entanglement shared with Alice. Again SEP is better than LOCC, in ways that can be quantified.
Some open problems:

  • What about the case of x_0 not= 0 W-states? They have some partial results, but it still remains open.
  • Can the gap between SEP and LOCC be increased arbitrarily?
  • Can one apply these ideas to other entanglement classes?
  • Do there exist similar phenomena in bipartite systems?
  • How much entanglement is required to implement SEP operations?

Rodrigo Gallego, Lars Erik Würflinger, Antonio Acín and Miguel Navascués:
Quantum correlations require multipartite information principles
(merged with)
An operational framework for nonlocality

Normally, we define “3-partite entanglement” to mean states that are not separable with respect to any bipartition; equivalently, they cannot be created by LOCC even if two parties get together. But this definition can be dangerous when we are considering nonlocality, as we will see below.
Nonlocality is a resource for device-independent information protocols. Define a local joint probability distribution to be one which satisfies P(ab|xy) =  sum_lambda p(lambda) P(a|x) P(b|y) .
Nonlocality is often described in terms of boxes that mimic measurements on entangled states. But entanglement can also be manipulated, e.g. by LOCC. What is the analogue for boxes? Are their variants of entanglement swapping, distillation, dilution, etc? In most cases, when we make the boxes stronger, the dynamics get weaker, often becoming trivial.
They define a new class or operations called WPICC, which stands for something about rewirings and pre- and post-processing (note that non-local boxes can be rewired in ways that depend on their outputs, so their causal structure can be tricky). WPICC are the operations which map local joint probability distributions to local joint probability distributions, and shouldn’t be able to create nonlocal correlations; indeed, we can use them to define nonlocal correlations, just as entanglement is defined as the set of states that can’t be created by LOCC.
However, with these operations, operations on two parties can create tripartite nonlocality, so this approach to defining nonlocality doesn’t work.
Instead, define a box to be tripartite nonlocal if it doesn’t have a TOBL (time-ordered bilocal) structure, meaning a Markov-like condition that’s described in the paper.
Moral of the story? If you want a sensible definition of tripartite correlations, then beware of operational definitions analogous to LOCC, and focus on mathematical ones, analogous to SEP.

Martin Schwarz, Kristan Temme, Frank Verstraete, Toby Cubitt and David Perez-Garcia:
Preparing projected entangled pair states on a quantum computer

There are lots of families of states which seem useful for giving short classical descriptions of highly entangled quantum states. For example, matrix product states are provably useful for describing the ground states of gapped one-dimensional quantum systems. More generally there are other classes of tensor network states, but their utility is less well understood. A prominent family of states for trying to describe ground states of 2d quantum systems is projected entangled pair states (PEPS). These are just tensor network states with a 2d grid topology. The intuition is that the structure of correlations in the PEPS should mimic the correlation in the ground state of a gapped system in 2d, so they might be a good ansatz class for variational methods.
If you could prepare an arbitrary PEPS, what might you be able to do with that? Schuch et al. proved that an oracle that could prepare an arbitrary PEPS would allow you to solve a PP complete problem. We really can’t hope to do this efficiently, so it begs the question: what class of PEPS can we prepare in BQP? That’s the central question that this talk addresses.
We need a few technical notions from the theory of PEPS. If we satisfy a technical condition called injectivity, then we can define (via a natural construction) a Hamiltonian called the parent Hamiltonian for which the frustration-free ground state is uniquely given by the PEPS. This injectivity condition is actually generic, and many natural families of states satisfy it, so intuitively it seems to be a reasonable restriction. (It should be said, however, that there are also many natural states which don’t satisfy injectivity, for example GHZ states.)
The algorithm is to start from a collection of entangled pairs and gradually “grow” a PEPS by growing the parent Hamiltonian term-by-term. If we add a term to the Hamiltonian, then we can try to project back to the ground space. This will be probabilistic and will require some work to get right. Then we can add a new term, etc. The final state is guaranteed to be the PEPS by the uniqueness of the ground state of an injective parent Hamiltonian.
In order to get the projection onto the new ground state after adding a new term to the Hamiltonian, we can use phase estimation. If you get the right measurement outcome (you successfully project onto the zero-energy ground space), then great! Just keep going. But if not, then you can undo the measurement with the QMA amplification protocol of Marriott and Watrous.
The bound on the run time is governed by a polynomial in the condition number of the PEPS projectors and the spectral gap of the sequence of parent Hamiltonians, as well as the number of vertices and edges in the PEPS graph.
This can also be generalized to PEPS which are so-called G-injective. This condition allows the method to be generalized to PEPS which have topological order, where the PEPS parent Hamiltonian has a degenerate ground space.
Good question by Rolando Somma: What happens if the ground state is stoquastic? Can you get any improvements?

Esther Hänggi and Marco Tomamichel:
The Link between Uncertainty Relations and Non-Locality

The main result: Nonlocality, which means achieving Bell value beta,
implies an uncertainty relation, namely, the incompatible bases used in the Bell experiment have overlap at most c^*=f(beta).
This is a sort of converse to a result of Oppenheim and Wehner.
This has some practical implications: the maximum basis overlap is important for things like QKD and the security of noisy storage, and this result implies that it can be tested robustly by observing a Bell inequality violation.
The kind of uncertainty relation we are interested in is of the form H(X|B) + H(Y|C) geq -log_2(c), and indeed because this is ETHZ work, we should expect a smoothed min-entropy version as well: H^epsilon_{max}(X|B) + H^epsilon_{min}(Y|C) geq -log_2(c).
Actually, we need a variant to handle the case that both bases contain a “failure” outcome. This work replaces the maximum overlap c (at least in the case of the BB84 bases) with c^* = frac{1+epsilon}{2}, where epsilon is the probability of the failure outcome. (See the paper for more general formulation.)
This parameter c^* is related to the maximum CHSH-type value beta via a nice simple formula. Unlike some previous work, it is somewhat more “device-independent”, not requiring assumptions such as knowing that the systems are qubits. beta in turn, we can relate to an entropic uncertainty relation.

Salman Beigi and Amin Gohari:
Information Causality is a Special Point in the Dual of the Gray-Wyner Region

What is information causality? Essentially the idea that the bound on random access coding should apply to general physical theories in a nonlocal setting. More precisely, Alice has independent bits a_1,ldots,a_N, Bob has b in [n], Alice sends a message x to Bob which becomes part of Bob’s state beta, and Bob tries to guess a_b. The bound is H(x) geq sum_{i=1}^N I(a_i; beta|b=i). This holds classically because Bob can guess a_1 without giving up his ability to guess bits a_2,ldots,a_n. So he can keep guessing all the others and his mutual information just adds up; but it cannot ultimately add up to more that H(x).
If you look at the details, the key ingredients are the consistency of mutual information (that is, it accurately represents correlations), data processing and the chain rule. These all apply to quantum states as well (hence the quantum random access bound).
Overview of the talk

  1. connection to Gray-Wyner problem
  2. generalizing information causality
  3. connecting to communication complexity.

The Gray-Wyner region is defined as the set of rates (R_0,ldots,R_n) such that there exists a random variable e with R_0 ge I(a;e) and R_i ge H(a_i|e) for i=1,ldots,N.
Thm: For any physical theory satisfying the above “key ingredients” plus AMI (accessibility of mutual information, to be defined later), the point
(H(X), H(a_1|beta_1, b=1), ldots, (H(a_N|beta_N,b=N)) belongs to the Gray-Wyner region.
AMI basically means that the mutual information satisfies something like HSW. There is an issue here, which the authors are looking into, with whether the Holevo information or the accessible information is the right quantity to use.
This theorem means that the information causality argument can be generalized: using the same assumptions, we also obtain all of the inequalities that are known to apply to the Gray-Wyner region. Hence the reference in the title to the dual of the Gray-Wyner region; the dual consists of all linear inequalities that are satisfied by the entire Gray-Wyner region.
This improves our understanding of these bounds. It also gives some new lower bounds on the communication cost of simulating nonlocal correlations.

Marcus P. Da Silva, Steven T. Flammia, Olivier Landon-Cardinal, Yi-Kai Liu and David Poulin:
Practical characterization of quantum devices without tomography

Current experimental efforts at creating highly entangled quantum states of many-body systems and implementing unitary quantum gates quickly run into serious problems when they try to scale up their efforts. The dimensionality of the Hilbert space is a serious curse if your goal is to characterize your device using full tomography. For example, the 8-qubit experiment that was done by Häffner et al. in 2005 took a tremendous number of measurements and many many hours of classical post-processing.
But what if you could avoid doing tomography and still get a good characterization of your device? Perhaps you are only interested in one number, say, the fidelity. Can you directly compute the fidelity without going through tomography?
Yes! And you can do it much more efficiently if you are trying to produce a pure quantum state or a unitary quantum gate.
We’ll focus on the case where the target state is a pure state for simplicity. But everything carries over directly to the case of unitary quantum gates with only minor modifications.
The fidelity between your target state psi = |psiranglelanglepsi| and the actual (arbitrary) state rho in the lab is given by F= mathrm{Tr}(psi rho).
If you expand the expression for F in everybody’s favorite basis, the Pauli basis, then you get
F = sum_i mathrm{Pr}(i) x_i where mathrm{Pr(i)} = mathrm{Tr}(psi hatsigma_i)^2/2^n is a probability distribution which depends only on the target state and
x_i = mathrm{Tr}(rho hatsigma_i)/mathrm{Tr}(psi hatsigma_i) is something which can be estimated in an experiment by just measuring hatsigma_i. (We could expand in any orthonormal operator basis, or even a tight frame; the Pauli basis is just to be concrete and experimentally accessible.) But this is just an expectation value of a random variable, = mathbb{E}(x), and moreover the variance is small, mathrm{Var}(x) le 1. To estimate the fidelity, we just sample from the probability distribution and then estimate x_i and then average over a bunch of iid samples. We only need O(1/epsilon^2) different observables to get an estimate to within additive error pm epsilon.
There are a couple of caveats. Sampling from the relevance distribution can in general be a hard problem for your classical computer (though you only have to sample a constant number of times). And the number of repeated measurements that you need to estimate the expectation value might have to scale with the dimension of the Hilbert space.
But there is some good news too. We get a lower bound on tomography which is tildeOmega(d^2) for the sample complexity using Pauli measurements for pure states, whereas the direct fidelity estimation protocol uses only $O(d)$ samples. Moreover, there are lots of classes of states which avoid the dimensional scaling, like stabilizer states, Clifford gates, or W-states. For these states and gates, the entire procedure is polynomial in the amount of time and the number measurements.

Robin Blume-Kohout:
Paranoid tomography: Confidence regions for quantum hardware
(merged with)
Matthias Christandl and Renato Renner:
Reliable Quantum State Tomography

We consider the setting of a source of quantum states and some measurements, followed by the reconstruction of the quantum state or process. How can we get useful and reliable error bars on the reconstruction?
One problem that you might have when doing a naive linear inversion of a state estimate is that you could have negative eigenvalues on the estimate, which is non-physical. How might you deal with these? One way to do that is to use maximum likelihood estimation (MLE). How might you get an error bar? You could use “bootstrapping”, which is to resample data from the initial estimate to try to estimate the variance. But this can be unreliable: you can even construct counterexamples where it fails miserably! Is there a better way? You could try Bayesian update, where you begin with a prior on state space and then computer a posterior and an error bar. Now everything that you report will depend on the prior, which might be undesirable in e.g. a cryptographic setting. So we need a reliable way to report error bars.
There are some existing methods for producing error bars (e.g. compressed sensing, MPS tomography, and others), but here we are interested in systematically finding the optimal confidence region for an estimate, one that is adapted to the data itself. It is beyond the scope of this work currently to consider also the notion of efficiency; this is an open question.
The main result: They derive region estimators such that the true state is contained in the region with high probability (over the data), and where the size of the region is, in a certain sense, minimal.
Question: how do these relate to the classical statistical notions of confidence region and Baysian credible interval? Tell us in the comments!
The authors use two different techniques to achieve these results. RBK’s results use a likelihood ratio function in the setting of iid measurements. The region is the set of states for which the likelihood function is larger than some threshold epsilon which was chosen a priori. The proof involves a classical statistical analysis. The construction by MC and RR is defined as the region in state space which contains the MLE with high probability with respect to Hilbert-Schmidt measure, but enlarged by a tiny factor. The proof in this case uses a postselection technique for quantum channels.
Here Matthias wants to give a new proof that was custom-made for QIP! It uses likelihood ratio regions for general measurements in a new way that all three authors came up with together. We can credit QIP with the result because the program committee forced Matthias+Renato to merge their talk with Robin Blume-Kohout’s. But the two papers had different assumptions, different techniques and different results! After discussing how to combine their proofs for pedagogical reasons (there wasn’t time to present both, and it also wouldn’t be so illuminating), they realized that they could use Matthias + Renato’s assumptions (which are more general) and Robin’s method (which is simpler) to get a result that is (more or less) stronger than either previous result.

Sandu Popescu:
The smallest possible thermal machines and the foundations of thermodynamics

Are there quantum effects in biology?
Biological systems are open, driven systems far from equilibrium.
Sandu: “In fact, I hope it is a long time before I myself reach equilibrium.”
This talk is somewhat classical, but it does talk about the physics of information, at small scales.
He talks about very small refrigerators/heat engines. It turns out there is no (necessary) tradeoff between size and efficiency; one can get Carnot efficiency with a three-level (qutrit) refrigerator, which is simultaneously optimal for both size and efficiency.
It was a great talk, but your humble bloggers had mostly reached their ground state by this point in the conference… see the photo at the top of the post.

QIP 2012 Day 4

Possible location of QIP 2014??

Aleksandrs Belovs
Span Programs for Functions with Constant-Sized 1-certificates, based on 1105.4024

Consider the problem of finding whether or not a triangle exists in a graph with n vertices. Naively using Grover’s algorithm requires time n^{3/2}. But a series of improvements have brought this down to n^{10/7} (using Grover with better combinatorics) and n^{13/10} (using element distinctness in a clever way). The current paper reduces this run time to n^{35/27}. That is, the exponent is improved from 1.3 to 1.296296296…
Before you roll your eyes and move to another tab, hear me out. The real improvement is in the technique, which is radically new, and has applications to many problems; not only triangle-finding.
The idea is to construct something called a learning graph. Each vertex is a subset of [n], and (directed) edges are weighted, and correspond to (sets of) queries. These are like randomized decision trees except they can have loops.
The complexity of the resulting algorithm (which we’ll describe below) is a little tricky to define. There is a “negative complexity” which is simply sum_e w_e. And a positive complexity which involves maximizing over all yes inputs x, and then minimizing over all flows of size 1 where the source is the empty set and the sinks are the 1-certificates for x.
Algorithms in this model correspond to flows of intensity one. The vertex corresponding to the empty set is the root, and 1-certificates (i.e. proofs that x is indeed a yes instance) are the sinks. The cost is sum_e p_e^2/w_e, where p denotes the flow.
From a learning graph, there is a recipe to produce a span program, and in turn a quantum query algorithm, with the query complexity equal to the geometric mean of the negative and positive complexities of the learning graph.
As an example, consider the OR function. Here the learning graph is comprised entirely of a root and n leaves. Weight each edge by 1, so the negative complexity is n. If there are k solutions, then the resulting flows are 1/k on each edge, and so the positive complexity is 1/k. The algorithm’s complexity is sqrt{n/k}. Phew!
Less trivial is element distinctness:
Are there two indices ineq j such that x_i=x_j?
Again the algorithm achieves the optimal complexity, which is O(n^{2/3}), matching Ambainis’s algorithm.
There are lots of new applications too, to improving the triangle runtime, but also getting a bunch of polynomial speedups for other problems that either are, or resemble, subgraph containment. For example, it improves some of the results from 1011.1443, and gets linear time algorithms to check whether a graph contains subgraphs from certain families, like lines, T-shaped things, and lines that go through stars.

Francois le Gall,
Improved output sensitive quantum algorithms for Boolean matrix multiplication

Boolean matrix multiplication means that the matrices are binary, + is replaced by OR, and * is replaced by AND. This arises in many cases, often related to graph theory, such as computing the transitive closure of a graph, triangle-detection, all-pairs shortest path, etc. Also it has application to recognizing context-free languages.
The naive algorithm is O(n^3). Reducing to integer multiplication gives O(n^{2.3ldots}). Quantumly, we can use Grover to compute each entry, for a total runtime of O(n^{2.5}).
But suppose C has only one nonzero entry. Then we can do recursive Grover search and get time complexity O(n^{1.5}) [Buhramn Spalek ‘05].
If it has l nonzero entries, we get O(n^{1.5}) sqrt l, neglecting polylogs.
The parameter l represents the sparsity of matrix “output sensitive algorithms”.
In this work, the runtime is improved to O(n^{3/2} + nl^{3/4}). There is also a more complicated result for query complexity. The techniques are adapted from papers of Williams & Williams, and Lingas. They also make use of a wide range of polynomial speedup tricks: Grover, approximate counting, triangle-detection, etc. The talk leaves the exciting impression that we not yet finished seeing improvements to these techniques.

Richard Cleve Discrete simulations of continuous-time query algorithms that are efficient with respect to queries, gates and space, joint work with Dominic Berry and Sevag Gharibian.

After he was introduced, Richard sheepishly admitted, “The title isn’t really that efficient with its use of space.”
Normally the oracle for a quantum query algorithm defines the oracle query by flipping the phase of certain bits. In the continuous-time setting, we can get other phases. So, for example, a ”half-query” would give Q |x rangle = i^{x_j} | x rangle, since i is a square root of minus one. In this model, we only charge 1/2 of a query for this oracle call. More generally, we can have arbitrary fractional queries. Then we can take a limit and get to continuous time to find expbigl(i(H_U + H_Q)tbigr) where H_U is the driving (gate) Hamiltonian and H_Q is a query Hamiltonian.
It is essential to be able to translate from the continuous-time model back to the discrete-time model for algorithmic applications. (cf. The work on NAND trees.) Is there a systematic way to do this translation? Yes, CGMSY-M `09 (improved by LMRSS `11) showed that the query complexity can be directly translated with the same number of queries. But these constructions are not gate efficient. Can we improve the gate complexity of the translation?
We have to be careful because the gate complexity will depend on the complexity of the driving Hamiltonian, so we have to carefully define a notion of approximately implementable Hamiltonian. If we have an epsilon-approximate implementation, and a driving Hamiltonian H, then the number of gates is at least G ge log (|H|)/epsilon, where the norm is the operator norm of the Hamiltonian (but modulo some details about the global phase.)
Main results: We can translate from the continuous-time setting to the discrete-time setting by using:

O(T log T / log log T) queries
O(T G log^2 TG) gates
O(log^2 TG) qubits of ancilla space

Miklos Santha, Hidden Symmetry Subgroup Problems, joint work with Thomas Decker, Gábor Ivanyos and Pawel Wocjan

First let’s observe an interesting connection. In the dihedral group D_{2p} for a prime p, the two-element subgroups generated by reflections coincide with the symmetries of the level sets of quadratic polynomials over mathbb{F}_p.
Can we more generally study subgroups which are hidden by symmetries? This generalizes and connects many problems in quantum algorithms, e.g. the hidden subgroup problem (HSP) over non-abelian groups, non-linear hidden structures and hidden polynomial problems among others, as the above example demonstrates.
Recall that the HSP is defined as follows. There is a function (for which you have an oracle) which is constant and distinct on the cosets of a subgroup of some group G. You want to obtain a generating set for the subgroup.
The hidden symmetry subgroup problem (HSSP): the input is a group G, a group action circ which acts on a set M, a closed set of subgroups mathcal{H}, and there is an oracle function f:M to S, such that (we’re promised) for some subgroup H in mathcal{H}, f(x) = f(y) leftrightarrow H circ x = H circ y for x,y in M. You want to output this (unique) H.

Rahul Jain and Ashwin Nayak:
A quantum information cost trade-off for the Augmented Index

Consider a twist (due to Yao ‘82) on the usual communication complexity problem. Alice has x, Bob has y, and they both want to compute f(x,y). But this time, they don’t want to leak (to the other party) more information than is already expressed by the answer, e.g. the millionaire problem.
Start with the usual example: equality testing. There is a O(log n) one-way protocol with 1/poly(n) error, and reveals only O(1) bits about one input.
Today’s talk is about:
Augmented index: Alice has x_1, ldots, x_n. Bob has k, x_1,ldots, x_{k-1}, b.
They want to know whether b=x_k.
(This has many connections to other works in theoretical computer science, not all of which are obviously communication-related. See paper for detail.)
Main result: For any quantum protocol which computes the augmented index with probability 1-epsilon on the uniform distribution, we must have either: 1) Alice reveals Omega(n/t) bits of info about x, or 2) Bob reveals Omega(1/t) about k, where t is the number of messages. This is true even when restricted to 0-inputs.
There is also an earlier classical result by the same authors: Alice reveals Omega(n) or Bob reveals Omega(1). If there are only two messages, then the lower bound for Bob is improved to Omega(log n) (which is optimal), even when restricted to 0-inputs.
At this point, Ashwin, apparently reading the audience’s mind, puts up a slide that is not quite as pertinent as Josh Cadney’s opening question, but is still what we are all thinking:
“Why Augmented Index? Why privacy w.r.t. 0-inputs?”
Answer: Suppose we care about a streaming algorithm, which for quantum is relevant because we might only have O(1) qubits for the foreseeable future. There are exponential space improvements in this model, but let’s just say they don’t look like the sort of problems that Cisco was dying to implement on their routers. Augmented Index, by contrast, is much more natural. One semi-related version is Dyck(2), which is determining whether a sequence of (,),[,] is properly bracketed (this is a special case of recognizing context-free languages). Here we observe a big separation between one and two passes over the input (if the second pass goes in reverse order), and the lower bound on the (classical) one-pass complexity is based on (classical) lower bounds for the augmented index problem.
Having motivated the question, Ashwin is left with “-2 minutes” to finish his talk. Talk about a cliffhanger! Now he’s guaranteed to have the proof accepted to QIP 2013 (although he did give away the punchline by stating the theorem at the start of his talk). Moreover, the chair apparently has a sign for -2 minutes left, since he didn’t say anything, he just held up a sign. I guess staying on time is not a strong point in our community. 🙂
Ashwin is now ruining his cliffhanger, giving an intuitive and cute overview of the proof! He claims that this part of the talk is “in reverse.”

Sevag GharibianHardness of approximation for quantum problems, joint work with Julia Kempe

Finding the ground state energy of a local Hamiltonian is QMA-complete, and this is harder than NP, because, um, there’s quantum in there too, and well, it seems harder? Sev wants to say something more concrete here. We’ve given up on solving local Hamiltonian exactly in general, because we’ve already given up on solving NP on a quantum computer (mostly). But surely there are good approximation algorithms for these things. Right? Well, today, we’ll talk about hardness of approximation, thus casting complexity-theoretic cold water on seemingly less ambitious goals.
All we know about approximating the local Hamiltonian problem:

  • PTAS on planar graphs
  • 1/d^{k-1}-approximation algorithm for dense graphs with k-local Hamiltonians.

This talk takes a step away from QMA, and considers instead the complexity class cq-Sigma_2, which is the set of problems that can be expresses as “there exists x, such that for all |psirangle, a poly-time verifier V accepts (x, |psirangle).
Their main result is to show that the following problem is complete for cq-Sigma_2, and the hardness part holds even if we demand only approximate solutions.
The quantum succinct set cover problem: Given a set of 5-local Hamiltonians acting on n qubits, and a parameter alpha>0, and the promise that
sum_{iin S} H_i succeq alpha I,
find the smallest subset S'subseteq S such that
sum_{iin S'} H_i succeq alpha I.
The diagonal case was shown to be hard to approximate by Chris Umans in 1999.
For the technical details, there are lots of nice tools, and you’ll just have to look at the paper (when it comes out…).

Jérémie Roland,Quantum rejection sampling, joint work with Maris Ozols and Martin Roetteler

Let’s first review classical rejection sampling, which is a tool that you can use to sample from arbitrary probability distributions. It has numerous applications: Metropolis sampling, Monte Carlo algorithms, optimization algorithms (e.g. simulated annealing) and many others. In the quantum case, we wish to replace probabilities by amplitudes.
Consider two probability distributions P and S. Suppose we can sample from P repeatedly, but we wish to sample from S. Define Pr[{rm accept} k] = gamma s_k/p_k and continue to sample until you accept. Here we will choose the parameter gamma in a way to try to minimize the expected amount of sampling from P that we need to do. The best way to do it is to choose gamma = min_k p_k/s_k. Then the expected number of samples from P to get a sample from S is 1/gamma.
Now for the quantum case. We are given access to a black box which prepares a state |pi^xirangle = sum_k pi_k |xi_krangle |krangle where the pi_k are known amplitudes. We want to prepare the state |sigma^xirangle = sum_k sigma_k |xi_krangle |krangle in terms of some new amplitudes. What is the randomized query complexity of this?
Instead of computing a function, which is the normal thing that we use oracles for, we wish to generate a quantum state. There is an oracle, which is a unitary that hides the label xi in a non-explicit way. The action of the oracle is O_xi |0rangle = sum_k pi_k |xi_krangle |krangle.
Let’s define quantum state preparations:
We have a set of quantum states Psi = {psi_xi}
set of oracles O = { O_xi }
Goal: generate psi_xi given black box access to O_xi
What’s the minimum number of calls to the oracle to get the correct state (up to a coherent error with amplitude sqrt{epsilon})?
We introduce a coherent coin.
O_xi |0rangle to sum_k pi_k |xi_krangle |krangle (sqrt{ pi - alpha})|0rangle + alpha|1rangle)
This rotated ancilla is like a classical coin, but now we have a superposition. If we measure the ancilla then we accept if we get one.
Using amplitude amplification, we can get a speedup over the naive accept probability.
We can also optimize over the rotation angle to get the optimum. This can be done efficiently since it’s an SDP. They can prove that this is an optimal algorithm. [photo 70]
Now for the applications. Linear systems of equations: the HHL algorithm basically just does phase estimation followed by quantum rejection sampling. [photo 71] The quantum Metropolis algorithm is a way to prepare a Gibbs state of H at inverse temperature beta.
Direct mapping of Metropolis MRRTT algorithm OK for diagonal Hamiltonians, but if Hamiltonian is not diagonal we want to avoid the work of diagonalizing it.
One of the reasons that the Q. Metropolis algorithm was difficult was that the “reject” moves required that you go back to a state that you just measured. You can avoid having to deal with this difficulty if you use quantum rejection sampling. One can amplify the accepted moves and therefore avoid having to revert moves at all!
Boolean hidden shift. f(x+s) = f(x) find s in Z_2^n. If the function is a delta function, then you can basically only do Grover. You can do bent functions (those with flat Fourier spectrum) with only 1 query [Roetteler 10]. These are the extreme cases, but what happens when we interpolate between them? We can use quantum rejection sampling to analyze this problem.
Other applications: Amplify QMA witnesses [MW05, NWZ09], prepare ground states of PEPS parent Hamiltonians on a quantum computer [STV11]. [photo 76]

Rolando Somma, Spectral Gap Amplification, joint work with Sergio Boixo

Preparation of eigenstates is a central problem and fast quantum methods for computing expectation values of observables leads to speedups for many problems. Let’s begin by considering adiabatic state transformations. The run time is governed by the spectral gap of the interpolating Hamiltonian. The generic cost C(T) of a quantum algorithm that prepares the eigenstate depends on the inverse power of the spectral gap.
given H with eigenstate psi and gap Delta, can we construct an H with same eigenstate psi but a larger gap? (without cheating by multiplying by some constant?) We want to consider that the Hamiltonian is a sum of terms which are, say, local. We have a black box which on input k, s evolves according to e^{-i s H_k}. How many times do we need to use this box? We require that the cost of evolving with H’ is the same.
They achieved a quadratic gap amplification for the following case: if H is frustration free, then the new gap is Omega(sqrt{Delta/L}) where L is the number of terms in the Hamiltonian. In particular, when the spectral gap is very small in terms of L this gives an improvement. It turns out this amplification is optimal.
Question: What prevents you from iterating amplification process?
Ans. The new Hamiltonian isn’t FF, so you can only do the amplification once.
S. Flammia: Optimality of results?
Ans. There may be better constructions with respect to the scaling in L.

Rajat Mittal, Quantum query complexity for state conversion, joint work with Troy Lee, Ben Reichardt, Robert Spalek and Mario Szegedy

We live in a quantum world, so why are we always trying to solve classical tasks, like computing f(x), for x specified by an oracle? How about trying to prepare |psi(x)rangle given x specified by an oracle? Or even more generally, converting |psi_1(x)rangle into |psi_2(x)rangle? To make things simple, let’s focus on creating |psi(x)rangle, although the results apply to more general state conversion problems as well. The difficulty of doing this is defined entirely by the Gram matrix G_{x,y} = langle psi(x)|psi(y)rangle.
Previous work on adversary bounds gave a tight characterization of quantum query complexity for boolean functions. The present work extends the result to general functions, and indeed to state conversion.
Ok, what is this thing gamma_2(M|Delta)?
Without the Delta, we define $gamma_2(M)$ to be the minimum over factorizations {u_x}, {v_y} (meaning that M_{x,y} = langle u_x, v_yrangle of max |u_x|,|v_y|
w.r.t. Delta; it’s similar, but with M_{x,y} = langle u_x, v_yrangle N_{x,y}.
e.g. gamma_2(M|J) = gamma_2(M) and gamma_2(M|M)=1.
This gives some nice corollaries, e.g. that function composition does what you’d expect to query complexity.

Fernando Brandao, Random quantum circuits are unitary polynomial designs, joint work with Aram Harrow and Michal Horodecki

Previously blogged here.
Patrick Hayden: To fool a circuit takes more time than the circuit does. Can you introduce some kind of restriction to make it go in a more favorable direction?
Aram: We could try to fool quantum Turing machines, but I don’t think our techniques say much that’d be helpful there.
Renato: Could you replace a random circuit by a random Hamiltonian?
Ans. Interesting question. Hayden et al do something like that. More interesting is to…


The headline stat: 42 papers were accepted out of 198 submitted, meaning a 21% acceptance rate. (Louis also said the “real” rate was 29%.)
There were 485 reviews, of which 104 are from 72 external reviewers.
There were 198 = 144 (early) + 54 (late) poster submissions.
Of the 42 talks, there were 4 plenary, 9 long, 27 regular and 2 merged.
The budget breakdown is: rental+technical 55k, food+drinks 90k, personnel 30k, student support 10k for a of total 185k.
There were 259 participants of which 113 students.
Thus conference fees were 105k and there was 85k from sponsors. This leaves a surplus of 5k, presumably to cover damages during the rump session.
For the locals, the conference booklet explains how to get to the rump session. You take the metro, and as Louis said, “you will visit a lot of stations.” Note that this post will go online after the rump session.
Also, where is QIP 2013???
BEIJING! (specifically, Tsinghua University)
Local organizers are: Andy Yao (general chair), Amy Wang (organizing chair), Giulio Chiribella, Luming Duan, Kihwan Kim, Jian Li, Yaoyun Shi, Shengyu Zhang.
There is a new CQI at Tsinghua. You can guess what it stands for. Note that few permutations of those three letters remain.
Tentative dates are: Jan 21-25 or Jan 28-Feb 1, 2013.
A truly open question is: where will QIP 2014 be?
Two tentative proposals on the table are IBM Yorktown Heights and Barcelona.
But volunteers are still welcome! If you want the experience of hosting QIP, send an email to Louis Salvail soon.
Possible locations include a science museum, a university campus and… La Pedrera? Seriously? (See picture at the top of the post.)
Then there was general discussion about QIP.
Suggestions include

  • not posting the long versions (at least without the authors confirming)
  • Renaming QIP Workshop -> QIP Conference.
  • Getting rid of the 3-page submissions? Here was the heated discussion. There always has to be a heated discussion about something.
  • Dec vs Jan? The audience seemed to overwhelmingly prefer January (among those who cared). And yet here they are, all here in December!

QIP 2012 Day 3

PSPACE is a black hole gobbling up the other complexity classes.
And you thought the LHC was dangerous!

Troy Lee and Jérémie Roland:
A strong direct product theorem for quantum query complexity

Suppose you want to solve k independent instances of a problem with k-fold as many resources. For example, say you want to juggle 4 balls with two hands vs 2 balls with one hand.
A Strong Direct Product Theorem (SDPT) says (roughly) the following:
If we require r resources to solve one instance of a problem with success probability p, then using fewer than kr resources to solve k instances cannot succeed with probability greater than pk.
Intuitively, it ought to hold in many situations, but has been proven to hold only in a few. Some applications of SDPT:

  • Making a hard function really hard, e.g. Yao’s XOR lemma
  • Amplify soundness of a protocol, e.g. Raz

Here they study Quantum Query Complexity, which uses a black box oracle that can be queried in superposition, as a proxy for quantum time complexity. It behaves so well that it scarcely deserves to be called a computational model.
The big picture is that they use adversary methods and the state generation formalism due to Ambainis. The method defines a type of potential function and then bounds the rate of change from one query to the next. In the additive setting, they bound the difference, and in the multiplicative setting, they bound the ratio. The state generation method has the advantage that it allows the separation of lower bounds into two parts: the bound on the zero-error state generation and the minimization of this bound over the valid final Gram matrices of a successful protocol.
It is known that the multiplicative adversary method characterizes bounded-error quantum query complexity since the multiplicative adversary is at least as large as the additive one. One of the achievements of this paper is that they show that the multiplicative adversary method still works when the convergence ratio is strictly bounded away from one. This immediately implies a SDPT by the results of Spalek `08, but they reprove the SDPT from scratch so that they can get a better optimization of the parameters.
Conclusion and open problems: For Boolean functions, they show an XOR lemma. Does SDPT hold for all state generation problems? Is the multiplicative adversary method also tight in the case of large error?

André Chailloux and Or Sattath:
The Complexity of the Separable Hamiltonian Problem

First recall that QMA is the quantum analog of NP: it’s the set of problems which can be verified efficiently on a quantum computer when given a quantum proof (i.e. a quantum state for a witness). For QMA wth 2 provers, denoted QMA(2), there are two unentangled states which are given as proofs. It’s similar to interrogation of suspects: Author can play the two Merlins against each other and it is probably a much stronger class. (Each Merlin still sends poly(n) qubits, so QMA(2) is at least as strong as QMA, but is probably stronger.) Pure n-representability is known to be in QMA(2) but not known to be in QMA. It’s known that QMA(k) = QMA(2) for all k>2. We know only that QMA(2) subseteq NEXP; this seems like a pretty weak upper bound!
The main result of this paper is to find a complete problem for QMA(2). The separable local Hamiltonian problem is actually in QMA, but the separable sparse Hamiltonian problem is QMA(2) complete.
The problem: given a Hamiltonian H = sum_i h_i where each term is either local or separable, then decide whether there exists a separable state with energy at most a or if there is no state with energy less than b. (Where b-a > 1/poly(n) is the promise gap.)
Separable Local Hamiltonian: Given a local Hamiltonian H find a separable (say across an n by n qubit cut) rho that minimizes mathrm{tr} H rho.
This sure doesn’t look like it’s in QMA!
If the prover sends rho, there’s no way to verify that it is a product, because you can’t test entanglement with a single copy.
Instead, the prover sends rho AND classical descriptions of all k-qubit marginals (k comes from H being a k-local Hamiltonian). The consistency of these marginals is in QMA; the witness is simply rho. (In fact, this problem is QMA-complete.)
Separable Sparse Hamiltonian: A totally different story! Now it’s QMA(2) complete. (The first known such problem.) Immediately we see that the qubit structure doesn’t exist, so we can’t rely on the strategy of looking at marginals. Clearly the problem is in QMA(2). To show completeness, we adapt Kitaev’s Hamiltonian for encoding a circuit (his so-called clock construction.) One challenge is that if the QMA(2) verifier makes an entangled measurement, then the history state will be entangled, which is not of the form we want. We want to simulate a computation in which the input is entangled, but we don’t care about the later steps; however, we are given the ability to minimize the energy of the sparse Hamiltonian for separable states, period.
The trick is that 1001.0017 shows that QMA(2) = QMA(2)-SEP, which is defined to be the class where the verifier is restricted to making separable (i.e. non-entangling measurements, at least for the yes outcome). This measurement requires a controlled-SWAP on the two proofs, which interestingly, is sparse, but not local. You might think you can build this out of a bunch of small controlled-SWAPs, but this would produce entanglement at intermediate steps. SWAP has a complicated relationship with entanglement. It doesn’t entangle product states, but only if it’s applied to the entire state; applying it to subsystems can create entanglement.
Many of the same open questions remain, about the status of QMA vs QMA(2) (although there is evidence from the case of log-size proofs to suggest that they are different), and about whether pure N-representability is also QMA(2) complete.

Yaoyun Shi and Xiaodi Wu:
Epsilon-net method for optimizations over separable states

The QMA(2) session continues! Once upon a time, I (Aram) thought QMA(2) was the kind of thing that complexity theorists make up to have more things to talk about, but I’ve since made a total U-turn. The problem, especially it’s log-size-witness variant, is equivalent to a ridiculously large number of problems, like separability testing, mean-field Hamiltonians, computing minimum output Rényi entropies of channels, finding the largest singular value of tensors, estimating the 2rightarrow 4 norm of matrices, and many others. So it’s exciting that the only dimension-independent hardness result we know requires quantum techniques (1001.0017). And also the only (classical) approximation algorithm! (1011.2751)
This talk changes this, with a pair of elementary approximation algorithms, one of which matches the performance of a weaker variant of 1010.1750 (details below).
To understand the basic problem, we don’t need to think about provers or circuits. The problem is as follows: Given a Hermitian matrix H on d^2 times d^2 dimensions, estimate {rm OptSep}(H) := max { {rm tr} H rho : rho in {rm Sep}(d,d)}, where {rm Sep}(d,d) denotes separable states with d dimensions for each party. This is a convex optimization problem, but it’s hard to test membership in {rm Sep}(d,d); indeed, this has equivalent difficulty. Alternatively, we know that the maximum is achieved for a rho which is a pure product state, where membership is easy to test, but this set is non-convex.
It is known that performing this optimization up to 1/poly(d) accuracy is NP-hard. (This result is originally due to Gurvits, although that’s not so widely known, since most people only noticed the part of his paper which talks about the hardness of testing membership in {rm Sep}(d,d).) Constant accuracy is only known to be as hard as solving a 3-SAT instance of length log^2(n).
This talk gives two algorithms for approximating OptSep.
One works well when H is nearly product; in particular, when H can be decomposed as H=sum_{i=1}^m A_i otimes B_i. The algorithm is to compute the joint numerical range of the A_1,ldots,A_m and B_1,ldots,B_m, meaning the sets of possible ({rm tr}rho A_1, ldots, {rm tr}rho A_m) and of possible ({rm tr}rho B_1, ldots, {rm tr}rho B_m). If m is not too big, then this is reasonable. Moreover, we can enumerate over the corresponding epsilon-nets with only a small amount of space. This implies that when QMA(2) is restricted to nearly product measurements (i.e. m leq mathrm{poly}(n)), then the resulting class is contained in PSPACE.
The second algorithm does eigenspace enumeration directly on H, and achieves additive error delta in time {rm poly}(d) exp(|H|_2^2delta^{-2} log(|H|_2/delta)). This matches one of the corollaries of 1011.2751, but not their main result.

Abel Molina and John Watrous:
Hedging bets with correlated quantum strategies

Alice and Bob are at it again, playing a nonlocal game. Here is the framework: Alice prepares a question and sends it to Bob, Bob responds by sending an answer to Alice. Based on Bob’s answer, as well as whatever memory she has of her own question, Alice decides whether Bob has passed or failed a test.
Alice’s actions are specified by a density operator and a two outcome POVM. We assume that Bob knows a complete description of Alice. Then the games that they consider here can be formulated as a semidefinite program to determine Bob’s optimal success probability for passing the test.
Let’s consider parallel repetition: Alice performs two tests independently, but Bob can correlate his strategy in an arbitrary way. Suppose that Bob’s optimal probability to pass a single instantiation of a given test is p, and Alice independently runs the test twice. There are several natural questions: What is the optimal probability for Bob to pass both or at least one test? To pass k out of n independent tests? If Bob uses an independent strategy, you can just calculate the success probability of that. Can he do better by correlating his strategies?
If Alice’s test is classical, then Bob gains no advantage. In the quantum setting, Bob’s optimal probability to pass both tests is again p2, so he gains no advantage by correlating the tests. Both facts can be proven using SDP duality.
It would be natural to conjecture that it is also optimal for Bob to play independently if he only cares about passing at least one test. But this turns out to be false. They have a simple counterexample to this natural conjecture, and they compute Bob’s optimal strategy for a simple nonlocal game. The strategy is a form of hedging.
This work is in a different setting than other results with non-classical correlations of measurement outcomes. It might give insight into possible attacks in cryptography, where breaking the system only once is all you need: you don’t try independent attacks, instead you entangle a joint attack! It also gives reduced errors in quantum interactive proof protocols. So far, they have proved some partial results along these lines.
Comment by Richard Cleve: There is a classical strategy for hedging in the CHSH game which can perfectly win 1 out of 2 parallel games.

Jop Briet and Thomas Vidick:
Explicit lower and upper bounds on the entangled value of multiplayer XOR games

Bell’s theorem says that the set of correlations P[ab|xy] achievable quantumly is bigger than the set achievable classically. For this talk let’s focus only on P[aoplus b | xy]. Let’s look quantitatively at how the quantum advantage depends on parameters like the number of questions, the number of answers, the number of players, and the amount of entanglement (could also consider the type). Boosting this advantage is experimentally relevant to prove wrong all those quantum-mechanics deniers out there. You know the type.
For bipartite entanglement, we understand the situation pretty well now, but what about multipartite? (Another motivation is that quantifying multipartite entanglement is tough because it’s not clear what the goals are: this gives a nice clear way of keeping score.)
For a desired bias ratio R (meaning the entangled value is R times the unentangled value).

upper bound (exists game)

lower bound (Best possible)

min questions



min local dimensions



Here’s the outline of the proof.

  1. For any matrix (of appropriate size) we construct a 3-player XOR game.
  2. Relate R to spectral properties of matrix.
  3. Existence of a good matrix that gives a large separation

And the details.
1. Given R, choose n such that 2^{n/2}=R.
Give each player n qubits.
The game is to choose Paulis P,Q,R with probability |{rm tr}M(Potimes Qotimes R)|^2 (suitably normalized). And ask the players to output bits whose XOR gives the sign of {rm tr}M(Potimes Qotimes R).
2. The entangled bias is at least the largest eigenvalue of M.
The classical bias is the 2-2-2 norm, = max langle M, A otimes B otimes Crangle where the max is taken over A,B,C with Frobenius norm leq 2^{n/4}.
3. Choose M randomly of course! Might as well make it a rank-1 projector, so the largest eigenvalue is 1, but other norms, like the 2-2-2 norm, are likely to be small.
Some open questions: They leave open a gap of size R2 in the number of questions needed for an R-sized bias ratio. It would be nice to get a tight analysis. Another open question is to derandomize their construction.

Gus Gutoski and Xiaodi Wu:
Parallel approximation of min-max problems
with applications to classical and quantum zero-sum games

Everyone knows that QIP=PSPACE. But did you also know that SQG=QRG(2)=PSPACE? (Short quantum games, and two-turn quantum refereed games, of course.) All of these follow from using the multiplicative weights update method to solve large SDPs in parallel. (Here is a review of the multiplicative weights update method that doesn’t discuss quantum applications, and here is a review that does).
This work concerns 2-turn quantum refereed games. Some motivation comes from complexity theory. We can consider interactive proof systems with two competing provers, one trying to prove Yes and the other trying to prove No.
The key result is that SQG is contained in PSPACE thus unifying and simplifying previous results about various games and interactive proofs being equivalent to PSPACE.
Additionally, they showed that public coin RG is not equal to RG unless PSPACE = EXP. Thus, the case of two competing provers is different from the case of a single untrusted prover (the usual IP situation) in which the messages to the prover can be replaced by public coins.
The key technical result is a way to parallelize an SDP related to whether certain quantum states exist. SDPs in general cannot be parallelized unless NC=P, but this result is possible because it deals with normalized quantum states going through trace-preserving quantum channels, and so the multiplicative weights update method (MWUM) can give a good approximation. Indeed MWUM has previously been used to quickly find solutions to classical zero-sum games (and convex programs in general), so it is natural to examine it for quantum zero-sum games. And even though there may be many rounds of interactions, these can be modeled by a series of linear and semidefinite constraints, applied to the density matrices appearing at various intermediate steps. This “transcript representation” was originally invented by Kitaev (whose ghostly presence looms large over this entire conference) in his work on coin flipping.
They also discussed penalization and rounding theorems and space efficiency.
The best part was a nice artistic depiction of PSPACE as a black hole gobbling up the other complexity classes, as shown at the top of the page!

QIP 2012 Day 2

Quantum Information Enters the Third Dimension

Itai Arad, Zeph Landau and Umesh Vazirani:
An improved area law for 1D frustration-free systems.

Some motivation: which quantum many-body systems can be simulated efficiently on a classical computer? Which states even have an efficient classical description (even if we might have trouble finding that description)? Can we efficiently describe their entanglement? These questions are important both theoretically and practically.

In this talk, we’ll address this question specifically focused on the case of ground states of gapped local Hamiltonians.
We care about this special case for reasons related to both physics and computer science. Local Hamiltonians are ubiquitous in condensed matter physics, and are the natural quantum version of constraint satisfaction problems in complexity theory.
For a state to be “efficient” in a meaningful sense, we should not just have a poly-sized classical description of the state. (For example, the local Hamiltonian itself implicitly gives such a description!) We should be able to efficiently calculate the expectation values of local observables.
For an arbitrary pure state, the Schmidt rank (mathrm{SR}) or the von Neumann entropy (S) can be as large as the total number of spins in one half of a bipartition. In condensed matter systems we expect that these measures of entanglement don’t have a volume scaling, but rather, a scaling proportional to the number of spins on the boundary of the region defining the bipartition. This is what is meant by an area law.
Why might you expect that an area law holds? One motivation is the exponential clustering theorem (Hastings & Koma `06, Nachtergaele & Sims `06) which says that gapped systems have connected correlation functions which decay exponentially. Namely, if |Omegarangle is a ground state of a gapped system, then for local observables with disjoint support we have

| langle ABrangle -langle Arangle langle B rangle | le c(A,B) expbigl(-mathrm{dist}(A,B), O(Delta E) bigr)

Here the constants depend on the strength of interactions and the norms of the local observables.
But correlation is not the same as entanglement! For example, there
are data hiding states (Hayden et al. `04, Hastings `07). Bouding
the entanglement length is much harder than simply bounding
correlation because, in a sense, entanglement can “sum up” a bunch of weak correlations and possibly add up to something large.
The area law has only been proven rigorously for 1d systems (Hastings `07) and it is open in other settings. For a 1D system, the area law means that the entropy of any contiguous block is bounded from above by a constant. Some implications: the ground state has an efficient description in terms of an MPS and it explains why DMRG works so well in practice.
What about dimensions higher than one? The constants in Hastings’ proof are exponential in X = log(d)/Delta E. (Here d is local site dimension.) The best lower bounds (Hastings & Gottesman `09, Irani `09) are only polynomial: S ge Delta E^{-1/4}. Improving these constants in the 1d proof might be able to give nontrivial bounds in 2d by course graining one of the dimensions, so progress on these constants could have highly nontrivial implications.
The new result, an upper bound which has only polynomial dependence on X. The new bound is, up to polylog factors, S le X^3.
The proof techniques use Chebyshev polynomials an they might be useful elsewhere. They assume the Hamiltonian is frustration free, but this probably can be extended to the more general case of frustrated gapped local Hamiltonians.
Outline of the proof
The general strategy: start with a product state and approach the ground state without generating too much entanglement. For example, let’s assume that the projection to the ground state doesn’t create too much entanglement, as quantified by the Schmidt rank: mathrm{SR}(Pi |phirangle) le D, mathrm{SR}(|phirangle) for some D. Then we could get a bound of S le log D in terms of the von Neumann entropy.
We will have to use an approximate ground state projector (AGSP). It needs to have three properties (here |Omegarangle is the unique ground state):

  1. K | Omega rangle = |Omegarangle
  2. | K |Omega^perprangle|^2 le Delta | |Omega^perprangle|^2
  3. mathrm{SR}(K|phirangle) le D, mathrm{SR}(|phirangle)

Three factors determine the bound on S: D, which measures how much entanglement we create; the shrinking factor Delta, which determines how fast we approach the ground state with each successive application of K; and mu, the overlap between the initial product state and the ground state.
Lemma: S(Omega) le O(1) dfrac{log mu}{log Delta} log D .
When D Delta < frac12, then there is a product state with big overlap with the ground state (mu isn’t too small). This is the bootstrap lemma. As a corollary, then for any AGSP we have an area law.
Now to prove the area law, we need to actually find a good AGSP. If it is frustration free, we can consider local projectors w.l.o.g. Then we can use the detectability lemma to get the AGSP.
The idea of the detectability lemma is to split the projectors into two layers, the product projector onto the even and odd terms in the Hamiltonian, respectively. Let A = Pi_ePi_o. Then we have

|A |Omega^perprangle|^2 le (1+Delta E/2)^{-2/3} | |Omega^perprangle|^2.

Therefore, A is an AGSP and the Schmidt rank increases by at most D = d^2 and shrinkage factor satisfies Delta = (1+Delta E/2)^{-2/3}. But we need a better AGSP to get an improvement to the area law.
Main idea: “dilute” A a bit so that we don’t let the entanglement accumulate too much. Can we dilute A so that the Schmidt rank decreases, but without worsening the shrinking factor by too much?
We define a new operator mathbb{N} = sum_i Q_i. It counts the number of constraint violations in a particular layer. It is a polynomial of degree m (the number of terms in the layer). Can we approximate it with a lower degree polynomial? Yes, and this is where the Chebyshev polynomials enter. Taking the Chebyshev polynomial of degree sqrt{m} gives a good approximation, but with a much lower degree. But now the number of terms is large.
We have to bound the number of terms in the sum. The bound is t = expbigl(O(log^2(q sqrt{m}) ellbigr). The overall Schmidt rank is at most D^ell.
Now we have to glue everything together. We still have to initially course grain the system a bit to deal with the case when Delta D > 1. (Recall, we need this product to be less than one half in order to get the first lemma to work).
Open questions: efficient representation for ground states of gapped local Hamiltonians in higher dimensions? Area law for higher dimensions? Can the proof be generalized to the frustrated case? Can this idea of “dilution” be used elsewhere?

Spyridon Michalakis and Justyna Pytel:
Stability of Frustration-Free Hamiltonians

I already blogged about this work (twice!) so I’ll just refer to that.

Toby Cubitt, Martin Schwarz, Frank Verstraete, Or Sattath and Itai Arad:
Three Proofs of a Constructive Commuting Quantum Lovasz Local Lemma

The Lovász Local Lemma (LLL) says that if several events are “mostly independent”, then the probability that none of these events occurs is still strictly positive. The constructive version of the LLL, whose most recent instantiation is due to Moser and Moser & Tardos, shows that the simplest algorithm you can imagine converges to a solution: just start with a random assignment and then check random constraints which are violated, fix them with a random assignment, and repeat until you reach a satisfying assignment. The quantum version of the LLL replaces classical “events” with projectors, but the current version is nonconstructive. Can we find a constructive proof in the quantum case?
There are two challenges: entanglement and the measurement-disturbance tradeoff. This talk will focus on the challenge of entanglement, since this challenge is present even in the presence of commuting Hamiltonians.
In the simplest generalization of Moser & Tardos (probabilistic version) to the quantum case, we just replace local qubits with the maximally mixed state. This is the Verstraete et al. `09 dissipative map, so we know it converges to the correct state, but we can’t (yet) bound the convergence rate.
Key lemma: We can bound the probability of generating a particular sampling sequence, i.e. a particular directed acyclic graph (DAG). Next, we can bound the expected number of violations, then use this as a subroutine in an efficient algorithm. This allows us to bound the runtime.
Second approach: use the original entropy compression argument of Moser. This goes through without too much effort. The algorithm halts in expected linear time.
What about the noncommutative case? Now there are two complementary problems: in the first proof, we can no longer bound the runtime. In the info-theoretic second proof, there is no guarantee that it halts on the correct state (though it definitely will halt). There are two conjectures which, if true, would enable a noncommutative constructive QLLL. See the paper for details.
Toby, you went way too fast for me to keep up, sorry.

Norbert Schuch:
Complexity of commuting Hamiltonians on a square lattice of qubits

I also blogged about this one previously.

Josh Cadney, Noah Linden and Andreas Winter (contributed talk):
Infinitely many constrained inequalities for the von Neumann entropy

The speaker addresses a central question which affects all of us: Why is this talk better than lunch? Answer: it’s different; it’s introductory.
The von Neumann entropy satisfies many inequalities, for example, positivity and the strong subadditivity inequality. In this talk, we are interested only in inequalities based on reduced states that are dimension independent.
Define the entropy vector: for any subset of systems alpha, we define reduced state on alpha, and the associated entropy. Stack all of these entropies into one big vector. If we fix the number of parties to be n, then we let Sigma_n be the set of all entropy vectors of n-partite states. We don’t make restrictions on the local Hilbert space dimension. We want to characterize this entropy vector (as a function of n). Since the von Neumann entropy tells us about bipartite entanglement, than this might tell us something about multipartite entanglement.
If we look at all Sigma_n, then we get a convex cone once we take a closure. Call this the entropy cone. What sort of inequalities do we know which characterize this entropy cone?

  • For one party, we know positivity.
  • For two parties, we know subadditivity and triangle inequality.
  • For three parties, we know strong subadditivity and weak monotonicity.

Do these existing inequalities tell us everything we need to know about the entropy cone? To check this, we look at the extremal rays and try to find an entropy vector there. For n=2, we find that there are no more inequalities: the above list is exhaustive. For three parties, we find four families of rays, and again there are no more inequalities.
For four parties, we can continue this program. The extremal rays are stabilizer states! Are there any more four party inequalities? We don’t yet know. To help address the question, we first look at the classical case (Shannon entropy). In the classical case we additionally have (non-weak) monotonicity. Playing the same game in the classical system, we learn from previous work that the four-party cone isn’t polyhedral: there are an infinite number of entropy inequalities.

New result: given a constraint on the mutual information I(A:C|B) = I(B:C|A) = 0, then there are new inequalities, and additional inequalities each time we add a new party. See the paper for details. The proof elaborates on a classical proof by Makarychev et al. from `02. Each of the inequalities is independent: every new inequality sharpens the bound on the entropy cone in a different way than the others. They have not been able to violate a single one of the new classical inequalities numerically.

Jeongwan Haah:
Local stabilizer codes in three dimensions without string logical operators

The general problem is to find a noise-free subspace/subsystem of a physical system on which manipulations are still possible. For a many-body system, local indistinguishability (aka topological order) is stable at zero temperature, but might be fragile at nonzero temperature. Here we are focusing on the case of memory only (so, only in the ground state).

First a review of the toric code. Because the logical operators are string-like, the toric code is not stable at nonzero temperature. This talk will try to build a code which doesn’t have this drawback.

The code is defined on a cubic lattice with 2 qubits on each vertex. There are no string-like logical operators in this model. This system has an exotic ground state degeneracy which depends on the system size.

Some general considerations:

  • Simple cubic lattice with m qubits per site
  • t types of cubic interactions with code space dimension > 1
  • No single-site logical operators
  • No obvious string operator

To simplify things:

  • t=2: one for X and one for Z type Paulis
  • Product of all terms in the Hamiltonian should be Id
  • Logical operator on a site should be Id
  • Logical operator on a straight line should be Id

These conditions can be translated to linear constraint equations on the symplectic vector space of the Abelianized Pauli group. There are 64 commutation relations on a single unit cell of a cubic lattice which determine everything and there are 27 independent linear constraints (from the above considerations). There are 37 free variables, and 30 remaining equations. If you brute force solve everything, then you find there are 17 solutions.
The first code has a Z<–>X symmetry and looks like this:
Some questions: Is this model topologically ordered? What do you mean by a “string”?
To answer the first condition, use the local topological order condition from Bravyi, Hastings, Michalakis. That is, observables with local support should not be distinguishable in the ground state. One can check this condition and it is satisfied. The computation simplifies if you take advantage of the symmetries.
Let’s get a rigorous definition of “string”. This is a key contribution of this work. A string is a finite Pauli operator which creates excitations at at most two locations. Define an “anchor” to be an envelope around the excitations. The “width” is the size of the anchors, the “length” is the distance between the anchors. There are no geometric restrictions. We need to tweak this definition a bit: we need a notion of a “trivial” string segment to rule out some annoying counterexamples.
The no-strings rule: String segments of width w and length L > c w are always trivial (where c is some constant). The above code satisfies the no-strings rule with c=15.
One can prove this using the same techniques developed above to check the local TQO condition. Jeongwan calls this the eraser technique.
To understand what the no-strings rule means, you have to develop some algebraic tools. Physically, it means you can’t drag defects; you can only annihilate them and then recreate them elsewhere. Jeongwan spent some effort describing these tools, but it was way too much for me to write all of it down (well, at least if I wanted to understand any of it). You’ll have to look at his papers for details.
Jeongwan specifically showed how both the local TQO and no-strings rule could be cast in the language of algebras of polynomials to give simple proofs for the case of his codes.

Andrew Landahl, Jonas Anderson and Patrick Rice:
Fault-tolerant quantum computing with color codes

Haah’s nice code makes me want to think more about 3d
But my 3d glasses here (puts on 3d color glasses) can make 2d color codes look 3 dimensional (see top of post.)
Consider color codes. One of the codes (the square-octagon lattice code, aka the 4,8,8 lattice) has a transversal S gate.
The control model: there is a faulty gate basis consisting of X, Z, H, S, S^dagger, CNOT, measurements on X and Z, and state preparations on 0, + and pi/4. Additional assumptions: parallel ops, refreshable ancillas, fast classical computation, equal-time gates.
Locality assumptions: 2D layout, local quantum processing.
We also consider two error channels: bit flips and double Pauli (DP).
Consider three models of error:
Circuit based noise model:

  • each preparation and one-qubit gate followed by BP(p)
  • Each CNOT gate followed by DP(p)
  • Each measurement preceded by BP(p) and result flipped with prob p

Phenomenological noise model:

  • Same, except each syndrome bit extraction circuit modeled as a measurement that fails with prob p. This ignores noise propagation between data and ancilla.

One other noise model is the code capacity noise model. See paper for details. 🙂
Decoders and thresholds: Can use an MLE decoder or an optimal decoder (which might be slow). The threshold seems to be around 10.9% for the optimal decoder, and not much worse for the MLE decoder: Some numerical results (depending on model) are 10.56%, 3.05%, 8times10^{-4}%

(An aside.) Random-bond Ising model: Consider an Ising model with random signs on the two-body couplings. Every CSS code has an associated random-bond Ising model. The Nishimori conjecture (which is false!) said that you can’t increase the order by increasing the temperature.
Details on syndrome extraction: One can use sequential schedules or interleaved schedules. We need to make sure that the code checks propagate to themselves, and syndrome checks propagate to themselves multiplied by a stabilizer element. They found an interleaving schedule that works.
Decoding: Use the code-capacity MLE decoder (works for all CSS codes). They formulate it as an integer program over GF(2), which can be rewritten as an integer program over the reals. (This isn’t efficient (NP-complete in general).)
Showed some pictures of the thresholds.
Turns out the interleaved schedule doesn’t have much impact… it’s about the same threshold (within error) as the sequential schedule.
Also showed some rigorous lower bounds which come from a self-avoiding walk.
Architectures and computation: can imagine a pancake architecture which does gate transversally between layers. How does this change the threshold?
Good point: Injection of magic states into codes hasn’t really been studied in the case where the injection is error-prone. This would be a good topic for future work.

Guillaume Duclos-Cianci, Héctor Bombin and David Poulin:
Equivalence of Topological Codes and Fast Decoding Algorithms

Note that I blogged about this previously.
Two gapped systems are in the same phase if there are connected by a local unitary transformation, meaning there is constant-depth quantum circuit (CDQC) which transforms one into the other.[Wen et al., 2010, Schuch et al. 2010] What makes two codes equivalent? Under what conditions does a CDQC exist?
Consider two copies of the Kitaev toric code (KTC). Does this create a new code? No… However, you could imagine applying a CDQC to the pair of codes which masks this product code. Thinking in reverse, you could ask which codes have such a character, namely, that they are loosely coupled versions of multiple copies of the KTC.
The main result is (roughly) that every 2D stabilizer code on qubits is equivalent (modulo CDQC) to several copies of the KTC, and moreover the mapping can be constructed explicitly.
Some useful concepts. Topological charges are equivalence classes of excitations which can be fused within a fixed bounded region back into the vacuum configuration. Excitation configurations are collections of defects inside a region. String operators move defects around.
The result applies in the following setting. They consider 2D lattices of qubits with geometrically local, translationally invariant stabilizer generators. To have a topological-like condition, they ask that the distance of the code scales with the linear size of the lattice (d sim L).
Gave a sketch of the proof… too detailed to blog in real time. See the paper for details.
Application to quantum error correction (QEC): The decoding problem is the same as trying to infer the worldline homology class from the measured particle configuration. You can decode, say, the 4,8,8 color code by deducing a mapping of the charges, then deducing an effective noise model on the KTCs, and then applying your favorite decoding algorithm for each KTC.
They did some numerics and found 8.7% fault-tolerance threshold for the 4,8,8 code using the RG decoder.

Joseph M. Renes, Frederic Dupuis and Renato Renner (contributed talk):
Quantum Polar Coding

Joe started off by disappointing the audience with an announcement that his all-ETHZ team had written a paper without any smooth entropies.
Instead, he talked about quantum polar codes, which are explicit, channel-adapted block quotes (CSS codes, in fact), with linear-time encoding and decoding, and achieve the symmetric (see below) coherent information for Pauli channels. They appear to need entanglement assistance, but maybe not? Details to follow.
The key idea is channel polarization. If we have two independent bits X_1, X_2 going into two bit channels with outputs Y_1, Y_2, then we can do a change of basis on the input to U_1 = X_1oplus X_2, U_2 = X_2. We think of this as a channel from U_1 to Y_1Y_2, together with a channel from U_2 to Y_1Y_2 that uses our imperfect knowledge of U_1 as side information. The first channel has less mutual information than our original channel, and the second channel has more. Hence we have “polarized” the channels, essentially taking the mutual information from one use and giving it to a different one.
The basic relation is:
I(U_1 ; Y_1 Y_2) + I(U_2 ; U_1Y_1Y_2) = 2 I(W)
where I(W) refers to the mutual information of the original channel.
If we keep doing this recursively, we can continue until we get a bunch of nearly-perfect (good) channels and a bunch of nearly-worthless (bad) channels. Remarkably, the fraction of good channels is close to the input capacity given uniform inputs (which we call the “symmetric capacity”). To encode, we send messages over the good channels, and freeze inputs to bad channels (both parties know which ones are which). This requires only nlog n CNOT gates. You can decode sequentially using ML; the recursive structure of the code makes ML efficient.
The quantum version just uses the fact that the transformation is good in the conjugate basis as well! You just turn the circuit “upside-down” (CNOTs reverse the target and control). For a Pauli channel, the polarization occurs in both amplitude AND phase, albeit in different directions. We only need to analyze amplitude and phase in order to verify that a channel can send general quantum states (see Christandl & Winter 2005 for the precise statement).
Some channels are going to be good for both amplitude and phase, so we use those to send entanglement. For the ones which are bad on amplitude, we freeze the amplitude, and similarly, for the ones which are bad on phase, we freeze the phase. We can use preshared entanglement for the channels which are bad on both amplitude and phase.
To decode, just successively and coherently decode both the channels. Reuse the classical decoders. Decode first amplitude, then phase conditional on the given amplitude. You can achieve the (symmetric) coherent information as the net rate.
From the numerical simulations, (see picture) it seems that this entanglement assistance is never actually needed!
In fact, for channels with low enough error rates they can show this rigorously. The proof techniques are similar to the original polar code proof techniques and use channel martingales.
One open question is to resolve whether or not the entanglement assistance is really necessary. Also, can they extend this to other channels? Yes! It works (joint work in progress with M. Wilde.) Plus, you can still do sequential decoding.

Sergey Bravyi and Robert Koenig:
Disorder-assisted error correction in Majorana chains

Consider, if you haven’t already, the toric code Hamiltonian H_0 given by minus the sum of toric code stabilizer generators. Now perturb it lightly by adding local Z fields: V= -msum_i Z_i. What does the storage fidelity look like (at zero temperature) as a function of time? That is, what happens if we encode an adversarially chosen pair of qubits, wait a time t, and then try to recover the qubits? Let Tstorage denote the longest time we can wait and still get 99% fidelity.
The bad news: T_{rm storage} = O(log N).
Crazy idea: Add randomness!
The idea is that the perturbation creates particles, which then hop around, creating strings that quickly lead to logical errors. But adding a little disorder can slow them down via the phenomenon of Anderson localization! In fact, the eigenfunctions can be exponentially localized. This is a nice example of a phenomenon in which something that sounds like a negative result—disorder making quantum particles move even more slowly than diffusion—turns into a postiive result, when we can use it to limit our old enemy, Eve.
Challenges: need it to work for many particles, and in 2-D. This is intractable to simulate classically.
So… instead we consider a 1D model that we can analyze, viz. unpaired Majorana modes in quantum wires (due, of course, to Kitaev).
The Hamiltonian is a sum over three types of terms: a chemical potential (a_j^dagger a_j), a hopping term (a_j^dagger a_{j+1} + h.c.), and a Cooper pair term (Delta^*a_i^dagger a_j^dagger + Delta a_ia_j). The base Hamiltonian has a 2-fold degenerate ground state (thus protecting a qubit). With respect to Majorana operators c_j, the Hamiltonian becomes the super simple:
frac{J}{2} sum_{j=1}^{N-1} i c_{2j} c_{2j+1}
The perturbation is V = sum_{j=1}^{N-1} i mu c_{2j-1} c_{2j}.
The resulting error-correction procedure resembles the minimum-weight matching procedure for the toric code, but is much simpler because it works on a line.
This code resembles the lowly repetition code, but actually has nontrivial distance.
Now what happens if the perturbation has randomly-varying weights. That is, V = sum_{j=1}^{N-1} i mu_j c_{2j-1} c_{2j} for random mu_j.
The expected storage time results can be summarized in the following table.

no disorder disorder implications
weak or no perturbations, with small N exp(Omega(N)) exp(Omega(N)) not so nice…
strong perturbations, with small N O(log N) Omega(N) nice…
large N O(log N) exp(Omega(N)) very nice!

Proofs are a mix of rigorous proofs and simulation.
Dynamical localization resembles that obtained for the 1-paricle Anderson model.
c_q(t) = sum_p R_{p,q}(t) c_p with |R_{p,q}(t)| leq e^{-|p-q|/t}.
That’s only for one particle, though. They actually need to prove a bound on the expected determinant of submatrices of R. Using this, they can show the fidelity remains high for an exponentially long time. This is interesting not only because of its implications for quantum memory, but because understanding multi-particle localization is a tough problem in general.
The proof uses quasi-adiabatic continuation for free fermions to argue that a superposition in the ground space is mostly preserved. Or more precisely, that the only damage occuring to it consists of correctable errors. The techniques look important and interesting. Properties such as gap stability that they prove might of interest elsewhere.
But they don’t solve everything: in some cases, they need numerical simulation. They use a number of tricks to tame the exponentials that would come from naive algorithms; in particular, taking advantage of the fact that they are working with fermionic Gaussian states (building on, and improving, an old result of Terhal and DiVincenzo).