## QIP 2015 live-blogging, Day 2

From the team that brought you “QIP 2015 Day 1 liveblogging”, here is the exciting sequel. Will they build a quantum computer? Will any complexity classes collapse? Will any results depend on the validity of the Extended Riemann Hypothesis? Read on and find out!

Praise from a reader of “day 1”:
QIP 2015 liveblogging — it’s almost like being there. Maybe better.

### David J. Wineland Quantum state manipulation of trapped ions (abstract)

Rather than “bore us” (his words) with experimental details, Dave gave a broad-brush picture of some of the progress that his lab has made over the years at improving the coherence of quantum systems.

Dave gave a history of NIST looking for more accurate clocks. Recently, a near-UV transition of trapped Hg ions at last did better than the continually improving microwave Cs standard.

At a 1994 conference at NIST, they invited Artur Ekert to speak about quantum gates. Cirac and Zoller gave the first detailed proposal for quantum computing with a linear ion trap at about this time. Dave’s group was quickly able to demonstrate one of these gates in a linear ion trap.

He showed and discussed a picture of the racetrack planar ion-trap array, where ions are moved into position to perform gates throughout the trap. They can move and manipulate the ions using a scheme due to Milburn, Schneider, James, Sørensen, and Mølmer that uses position-dependent dipole forces. The transverse Ising model can be simulated by applying a moving standing wave to ions in a linear trap; this is a test case for useful simulations.

Other groups at NIST have also done impressive work on quantum simulation. Bollinger’s group has made a self-assembled triangular lattice with Ising-type couplings that we talked about previously here on the Pontiff.

Everyone in the ion trap business is plagued by something called “anomalous heating”, of unknown origin, which gets worse as the length scale gets smaller. Colleagues studying surface science have suggested using an argon ion cannon (damn, that sounds impressive) to blast away impurities in the surface trap electrodes, scrubbing the surface clean. This has reduced anomalous heating 100-fold, but it’s still above all known electronic causes. Using cryogenic cooling helps too, as has been done by Ike Chuang’s group at MIT.

Laser intensity fluctuations at the site of the ions are another continual source of error. Optical and IR beams can be efficiently transmitted and positioned by optical fibers, but UV beams create color centers and degrade optical fiber on a timescale of about an hour. Recent work by the group has shown that this degradation timescale can be extended somewhat.

Dave showed a list, and there are more than 30 groups around the world working on ion-trap quantum information processing. Pretty impressive!

Dave showed this Time magazine cover that calls D-Wave the “Infinity Machine” that no one understands. In contrast, he says, we know how quantum computing works… and how it doesn’t. Sober experimentalists seem to be in rough agreement that

• A factoring machine is decades away.
• Quantum simulation may be possible within the next decade.
• The real excitement will be a simulation that tells us something new about physics.

### Joel Wallman and Steve Flammia Randomized Benchmarking with Confidence (abstract, arXiv:1404.6025)

Randomized benchmarking is a standard method whereby experimental implementations of quantum gates can be assessed for their average-case accuracy in a way that doesn’t conflate the noise on the gates with the noise of state preparation and measurement (SPAM) errors.

The protocol is simple:

• Choose a random sequence of $m$ Clifford gates.
• Prepare the initial state in the computational basis.
• Apply the Clifford gate sequence and then the inverse gate at the end.
• Measure in the computational basis.

Repeat this for many random sequences and many repetitions of each sequence to get statistics. Under a certain noise model called the “0th order model”, the averages of this procedure for different values of $m$ will fit to a model of the form $F_m = A + B f^m$ where $f$ is a quantity closely related to the average quality of the gates in the sequence. Define $r$ to be the average error rate. (Morally, this is equivalent to “1-f”, in the above model, but the actual formula is more complicated.) To understand the convergence of this protocol to an estimate, we need to understand the variance as a function of $m,r$.
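To make the fitting step concrete, here is a minimal sketch of fitting the decay $F_m = A + B f^m$ to simulated data. It does not simulate Clifford circuits: it simply assumes the 0th order model holds and draws binomial shot noise around it. The function names, the grid search over $f$, and the parameter values are all illustrative assumptions, not anything from the talk.

```python
import numpy as np

def rb_model(m, A, B, f):
    """0th order randomized-benchmarking decay F_m = A + B * f**m."""
    return A + B * f ** m

def simulate_rb(lengths, A, B, f, shots, rng):
    """Survival frequencies with binomial shot noise (assumes the model is exact)."""
    p = rb_model(np.asarray(lengths, dtype=float), A, B, f)
    return rng.binomial(shots, p) / shots

def fit_rb_decay(lengths, survival, f_grid=None):
    """Fit (A, B, f): for each candidate f, solve linear least squares for A and B."""
    lengths = np.asarray(lengths, dtype=float)
    survival = np.asarray(survival, dtype=float)
    if f_grid is None:
        f_grid = np.linspace(0.5, 0.9999, 5000)
    best = None
    for f in f_grid:
        design = np.column_stack([np.ones_like(lengths), f ** lengths])
        coeffs, *_ = np.linalg.lstsq(design, survival, rcond=None)
        resid = np.sum((design @ coeffs - survival) ** 2)
        if best is None or resid < best[0]:
            best = (resid, coeffs[0], coeffs[1], f)
    _, A_hat, B_hat, f_hat = best
    return A_hat, B_hat, f_hat

# Hypothetical experiment: SPAM is absorbed into A and B, gate quality into f.
rng = np.random.default_rng(0)
lengths = [1, 2, 4, 8, 16, 32, 64, 128, 256]
data = simulate_rb(lengths, A=0.5, B=0.45, f=0.98, shots=5000, rng=rng)
A_hat, B_hat, f_hat = fit_rb_decay(lengths, data)
```

The point of the variance bound in the talk is to say how many sequences and which lengths $m$ are needed before a fit like this can be trusted; the sketch above only shows the fit itself.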

The main contribution is to reduce the variance bound from the trivial bound of $O(1)$ to $O(mr)$. This provides a good guide on how to choose optimal lengths $m$ for experiments, and the bounds are nearly exact in the case of a single qubit. In the parameter range of interest, this improved over previous estimates of the sample complexity by three orders of magnitude.

### Fernando Brandao, Marcus Cramer and Madalin Guta A Berry-Esseen Theorem for Quantum Lattice Systems and the Equivalence of Statistical Mechanical Ensembles (abstract)

The full version is not yet on the arxiv, but due to an author mistake, the above link gives the long version of the QIP submission. Download it there while you still can!

Quantum many-body systems are pretty wild objects, with states in $2^{10^{23}}$ dimensions or even worse. But we often have a mental model of them as basically like non-interacting spins. In some cases, the renormalization group and other arguments can partially justify this. One thing that’s true in the case of non-interacting spins is that the density of states is approximately Gaussian. The idea here is to show that this still holds when we replace “non-interacting spins” with something morally similar, such as exponentially decaying correlations, bounded-range interactions, etc.

This way of writing it makes it sound trivial. But major open questions like the area law fit into this framework, and proving most statements is difficult. So technical advances in validating our “finite correlation length looks like non-interacting spins” intuition can be valuable.

Today’s technical advance is a quantum version of the Berry-Esseen theorem. The usual Berry-Esseen theorem gives quantitative bounds on the rate of convergence to the Gaussian distribution in the central limit theorem. Here we consider a lattice version, with spins on a $d$-dimensional lattice and local observables A and B that act on subsets of spins separated by a distance L. We require a finite correlation length, as we get, for example, for all Gibbs states above some critical temperature (or at any nonzero temperature in $d=1$).
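For readers who haven’t seen the classical statement, here is a small numerical sketch of it for the simplest “non-interacting spins” case: a sum of $n$ independent biased coin flips. It compares the exact Kolmogorov distance to the Gaussian with a bound of the form $C\rho/(\sigma^3\sqrt{n})$, where $\rho = \mathbb{E}|X-\mathbb{E}X|^3$. The constant $C=0.56$ is an assumption taken from the classical literature, and nothing here is the quantum lattice version from the paper.

```python
import math

def binomial_kolmogorov_distance(n, p):
    """sup_x |F_n(x) - Phi(x)| for the standardized Binomial(n, p) distribution."""
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))

    def phi(x):
        return 0.5 * (1 + math.erf(x / math.sqrt(2)))

    worst, cdf = 0.0, 0.0
    for k in range(n + 1):
        gauss = phi((k - mean) / sigma)
        worst = max(worst, abs(cdf - gauss))  # just below the jump at k
        cdf += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        worst = max(worst, abs(cdf - gauss))  # at the jump itself
    return worst

def berry_esseen_bound(n, p, constant=0.56):
    """C * rho / (sigma^3 * sqrt(n)) for Bernoulli(p); C = 0.56 is an assumed constant."""
    sigma_cubed = (p * (1 - p)) ** 1.5
    rho = p * (1 - p) * (p ** 2 + (1 - p) ** 2)  # third absolute central moment
    return constant * rho / (sigma_cubed * math.sqrt(n))

distances = {n: binomial_kolmogorov_distance(n, 0.3) for n in (10, 100, 400)}
bounds = {n: berry_esseen_bound(n, 0.3) for n in (10, 100, 400)}
```

Both quantities shrink like $1/\sqrt{n}$, which is the kind of quantitative statement the quantum theorem aims for when independence is replaced by a finite correlation length.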

What does a (quantitative) CLT give us beyond mere large deviation bounds? It shows that the density of states (at least those inhabited by the particular state $\rho$) is roughly Gaussian, thereby roughly matching what we would get from a tensor power state. This is somewhat stronger than the “typical subspace”-type guarantees that we would get from large deviation bounds.

The main application here is an equivalence theorem between the canonical and microcanonical ensembles: i.e. between the Gibbs state and a uniform mixture over an energy band of width $O(\sqrt N)$. These states are far apart in trace distance, but this paper shows that they look similar with respect to sufficiently local observables. If you think this sounds easy, well, then try to prove it yourself, and then once you give up, read this paper.

### Michael Kastoryano and Fernando Brandao Quantum Gibbs Samplers: the commuting case (abstract, arXiv:1409.3435)

How efficiently can we prepare thermal states on a quantum computer? There is a related question: how does nature prepare states? That is, what is the natural rate for thermalization given a quantum lattice system? There are two possible ways to model thermalization, both of which are computationally efficient. “Davies generators” mean local jumps that can be modeled as local interactions with a Markovian bath at a fixed temperature, while “heat-bath generators” mean that we repeatedly apply the Petz recovery map to small blocks of spins. Call both “Gibbs samplers.”

Consider the setting where you have a system living on a lattice with a bit or qubit on each site, and some memoryless, spatially local, dynamics. Classically the powerful tools of DLR (Dobrushin-Lanford-Ruelle) theory imply a close relation between properties of the dynamics and properties of the stationary state. Specifically, spatial mixing (meaning decaying correlations in the stationary state) can be related to temporal mixing (meaning that the dynamics converge rapidly to the stationary state). (The best reference I know is Martinelli, but for a more CS-friendly version, see also this paper.)
An exact quantum analogy to this cannot be reasonably defined, since the classical definition involves conditioning – which often is the reason classical information theory ideas fail to translate into the quantum case.

One of the first contributions of this work then is to define quantum notions of “weak clustering” (more or less the familiar exponential decay of correlations between well-separated observables) and “strong clustering” (a more complicated definition involving overlapping regions). Then the main result is that there is an intimate connection between the rate of convergence of any quantum algorithm for reaching the Gibbs state and the correlations in the Gibbs state itself. Namely: strong clustering (but not weak clustering) is equivalent to rapid mixing of the Gibbs sampler. Everything here assumes commuting Hamiltonians, by the way. Also, “rapid mixing” is equivalent to the Gibbs sampler being gapped (think of this like the quantum version of being a gapped Markov chain).

One direction is fairly straightforward. To show that strong clustering implies a gapped Gibbs sampler, we directly apply the variational characterization of the gap. (The dynamics of a continuous-time Gibbs sampler can be written as $\dot\rho = -\mathcal{A}[\rho]$ for some linear superoperator $\mathcal{A}$, which we will assume to be Hermitian for convenience. $\mathcal{A}$ has all nonnegative eigenvalues because it is stable, and it has a single eigenvalue equal to 0, corresponding to the unique stationary distribution. The gap is given by the smallest positive eigenvalue, and this “smallest” is what gives rise to the variational characterization. See their paper for details.) The variational calculation involves a minimization over (global) states and strong clustering lets us reduce this to calculations involving local states that are much easier to bound.
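To see what “gapped” means in the simplest setting, here is a sketch of a classical analogue: heat-bath Glauber dynamics for a small periodic Ising chain. This is not the quantum Davies or heat-bath generator from the paper; it is a discrete-time classical chain where $\mathcal{A} = I - P$ plays the role of the generator, and the chain length, coupling and function names are all illustrative assumptions. The gap is read off as the smallest positive eigenvalue, as described above.

```python
import itertools
import numpy as np

def glauber_generator(n_spins, beta, coupling=1.0):
    """Generator A = I - P of heat-bath Glauber dynamics on a periodic Ising chain."""
    configs = list(itertools.product([-1, 1], repeat=n_spins))
    index = {c: k for k, c in enumerate(configs)}

    def energy(c):
        return -coupling * sum(c[i] * c[(i + 1) % n_spins] for i in range(n_spins))

    weights = np.array([np.exp(-beta * energy(c)) for c in configs])
    gibbs = weights / weights.sum()
    P = np.zeros((len(configs), len(configs)))
    for c in configs:
        x = index[c]
        for i in range(n_spins):
            flipped = c[:i] + (-c[i],) + c[i + 1:]
            y = index[flipped]
            # Resample spin i from its conditional Gibbs distribution.
            p_flip = weights[y] / (weights[x] + weights[y])
            P[x, y] += p_flip / n_spins
            P[x, x] += (1 - p_flip) / n_spins
    return np.eye(len(configs)) - P, gibbs

def spectral_gap(generator, gibbs):
    """Return (smallest eigenvalue, next eigenvalue) of a reversible generator."""
    d = np.sqrt(gibbs)
    sym = (d[:, None] * generator) / d[None, :]  # symmetric by detailed balance
    eigenvalues = np.sort(np.linalg.eigvalsh((sym + sym.T) / 2))
    return eigenvalues[0], eigenvalues[1]

generator, gibbs = glauber_generator(n_spins=4, beta=0.5)
zero_mode, gap = spectral_gap(generator, gibbs)
```

At infinite temperature ($\beta=0$) each update just randomizes one of the $N$ spins, and the gap of this chain is exactly $1/N$; the interesting question, classically and quantumly, is how the gap behaves as correlations in the Gibbs state grow.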

In the other direction (gap implies strong clustering), we relate the Gibbs sampler to a local Hamiltonian, and use the detectability lemma, which in fact was originally used in part to prove a statement about decay of correlations. The idea is to construct an AGSP (approximate ground-state projector) which is a low-degree polynomial of the Hamiltonian. Because it’s low degree, applying it does not increase the entanglement across any cut by much (useful for proving area laws) or does not propagate correlations far (along the lines of Lieb-Robinson; useful for correlation decay).

When can these results be applied? In 1-D, strong and weak clustering are equivalent (because boundary terms can be removed), and therefore both are implied by (Hamiltonian) gap. Also in any number of spatial dimensions, above a universal critical temperature the Gibbs samplers are always gapped.

Some open questions:

• If in 2-D, one could also show strong=weak clustering (as is known classically in <3 dimensions), it would put the final nail in the coffin of 2-D quantum memory for any commuting Hamiltonian.
• Classically, there is a dichotomy result: either there is very rapid mixing (log(N) time) or very slow mixing (exp(N) time). Here they can only get poly(N) mixing. Can these results be extended to the log-Sobolev type bounds that give this type of result?

### Mehmet Burak Şahinoğlu, Dominic Williamson, Nick Bultinck, Michael Marien, Jutho Haegeman, Norbert Schuch and Frank Verstraete Characterizing Topological Order with Matrix Product Operators MERGED WITH Oliver Buerschaper Matrix Product Operators: Local Equivalence and Topological Order (abstract-137, arXiv:1409.2150, abstract-176)

Characterizing topological quantum order is a challenging problem in many-body physics. In two dimensions, it is generally accepted that all topologically ordered ground states are described (in a long-range limit) by a theory of anyons. These anyonic theories have characteristic features like topology-dependent degeneracy and local indistinguishability in the ground space and string-like operators that map between these ground states.

The most famous example of this is Kitaev’s toric code, and we are interested in it at a quantum information conference because of its ability to act as a natural quantum error-correcting code. The four ground states of the toric code can be considered as a loop gas, where each ground state is a uniform superposition of all loops on the torus satisfying a given parity constraint.
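The count of four ground states can be checked with a few lines of linear algebra over GF(2). The sketch below builds the star ($X$-type) and plaquette ($Z$-type) checks of the toric code on an $L\times L$ torus with $2L^2$ qubits on the edges, and computes the ground-space dimension as $2^{n - \mathrm{rank}}$. The edge labelling and function names are my own illustrative choices, and it assumes $L \geq 2$.

```python
def gf2_rank(rows):
    """Rank over GF(2) of bit-vectors stored as Python integers."""
    rows = [r for r in rows if r]
    rank = 0
    while rows:
        pivot = rows.pop()
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low_bit else r for r in rows]
        rows = [r for r in rows if r]
    return rank

def toric_code_checks(L):
    """Star and plaquette checks as edge bitmasks on an L x L torus (L >= 2)."""
    def h(x, y):  # horizontal edge leaving vertex (x, y)
        return (y % L) * L + (x % L)

    def v(x, y):  # vertical edge leaving vertex (x, y)
        return L * L + (y % L) * L + (x % L)

    stars, plaquettes = [], []
    for x in range(L):
        for y in range(L):
            star = {h(x, y), h(x - 1, y), v(x, y), v(x, y - 1)}
            plaquette = {h(x, y), h(x, y + 1), v(x, y), v(x + 1, y)}
            stars.append(sum(1 << e for e in star))
            plaquettes.append(sum(1 << e for e in plaquette))
    return stars, plaquettes

def ground_space_dimension(L):
    stars, plaquettes = toric_code_checks(L)
    n_qubits = 2 * L * L
    return 2 ** (n_qubits - gf2_rank(stars) - gf2_rank(plaquettes))

def checks_commute(L):
    """X-type and Z-type checks commute iff they overlap on an even number of edges."""
    stars, plaquettes = toric_code_checks(L)
    return all(bin(s & p).count("1") % 2 == 0 for s in stars for p in plaquettes)
```

Each family of checks has exactly one redundancy (the product of all stars, or of all plaquettes, is the identity), which is where the two leftover logical qubits come from.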

The goal in this talk is to classify types of topological order using the formalism of matrix product states, and their slightly more general cousins, matrix product operators (MPO). The authors define an algebra for MPOs that mimics the algebra of loop operators in a topologically ordered material. Because matrix product operators have efficient descriptions classically, they are well suited to numerical studies, and their structure also allows them to be used for analytical investigations.

The main idea that the authors introduce is a condition on MPOs so that they behave like topological operators. In particular, they obey a “deformation” condition that lets them be pushed around the lattice, just like Wilson loops.

The authors used this idea to study models that are not stabilizer codes, such as the double semion model and more generally the class of string-net models. This looks like a very promising tool for studying topological order.

### Dorit Aharonov, Aram Harrow, Zeph Landau, Daniel Nagaj, Mario Szegedy and Umesh Vazirani Local tests of global entanglement and a counterexample to the generalized area law (abstract, arXiv:1410.0951)

Steve: “Counterexamples to the generalized area law” is an implicit admission that they just disproved something that nobody was conjecturing in the first place. 😉

### Xiaotong Ni, Oliver Buerschaper and Maarten Van Den Nest A non-commuting Stabilizer Formalism (abstract, arXiv:1404.5327)

This paper introduces a new formalism called the “XS stabilizer” formalism that allows you to describe states in an analogous way to the standard stabilizer formalism, but where the matrices in the group don’t commute. The collection of matrices is generated by $X, S, \alpha$, where $\alpha = \sqrt{i}$ and $S = \sqrt{Z}$ on $n$ qubits. A state or subspace that is stabilized by a subgroup of these operators is said to be an XS stabilizer state or code. Although these are, as Xiaotong says, “innocent-looking tensor product operators”, the stabilizer states and codes can be very highly entangled.
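The non-commutativity is easy to see on a single qubit. The following sketch just writes down the $2\times 2$ matrices for $X$, $S=\sqrt{Z}$ and the phase $\alpha=\sqrt{i}$ and checks a few relations numerically; the function name and the particular relations checked are my own choices for illustration.

```python
import numpy as np

def xs_relations():
    """Numerically check basic relations among the single-qubit XS generators."""
    alpha = np.exp(1j * np.pi / 4)            # alpha = sqrt(i)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Z = np.diag([1, -1]).astype(complex)
    S = np.diag([1, 1j])                       # S = sqrt(Z)
    S_cubed = np.linalg.matrix_power(S, 3)
    return {
        "alpha_squared_is_i": bool(np.isclose(alpha ** 2, 1j)),
        "S_squared_is_Z": bool(np.allclose(S @ S, Z)),
        "X_and_S_commute": bool(np.allclose(X @ S, S @ X)),
        # X and S commute only up to a group element: X S = alpha^2 S^3 X.
        "XS_equals_alpha2_S3_X": bool(np.allclose(X @ S, alpha ** 2 * S_cubed @ X)),
    }

relations = xs_relations()
```

So moving an $X$ past an $S$ costs a phase and turns $S$ into $S^3$, but the result stays inside the group generated by $X$, $S$ and $\alpha$, which is what makes a stabilizer-like calculus possible.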

One of the applications of this formalism is to classify the double semion model, which is a local Hamiltonian model with topological order. There are sets of general conditions for when such states and codes can be ground states of local commuting XS Hamiltonians. Unfortunately, not all of these properties can be computed efficiently; some of these properties are NP-complete to compute. There are some interesting open questions here; for example, for which classes of commuting projector Hamiltonians are the ground states in NP?

### Dave Touchette Direct Sum Theorem for Bounded Round Quantum Communication Complexity and a New, Fully Quantum Notion of Information Complexity (Recipient of the QIP2015 Best Student Paper Prize) (abstract, arXiv:1409.4391)

“Information complexity” is a variant of communication complexity that measures not the number of bits exchanged in a protocol but the amount of “information”, however that is defined. Here is a series of tutorials for the classical case. Entropy is one possibility, since this would put an upper bound on the asymptotic compressibility of many parallel repetitions of a protocol. But in general this gives up too much. If Alice holds random variables AC, Bob holds random variable B and Alice wants to send C to Bob then the cost of this is (asymptotically) $I(A:C|B)$.

This claim has a number of qualifications. It is asymptotic and approximate, meaning that it holds in the limit of many copies. However, see 1410.3031 for a one-shot version. And when communication is measured in qubits, the amount is actually $\frac{1}{2} I(A:C|B)$.

Defining this correctly for multiple messages is tricky. In the classical case, there is a well-defined “transcript” (call it T) of all the messages, and we can define information cost as $I(X:T|Y) + I(Y:T|X)$, where X,Y are the inputs for Alice and Bob respectively. In the quantum case we realize that the very idea of a transcript implicitly uses the principle that (classical) information can be freely copied, and so for quantum protocols we cannot use it. Instead Dave just sums the QCMI (quantum conditional mutual information) of each step of the protocol. This means $I(A:M|B)$ when Alice sends $M$ to Bob and $I(B:M|A)$ when Bob sends $M$ to Alice. Here $A,B$ refer to the entire systems of Alice/Bob respectively. (Earlier work by Yao and Cleve-Buhrman approached this in other, less ideal, ways.)
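As a sanity check on the classical definition, here is a sketch that computes $I(X:T|Y) + I(Y:T|X)$ for about the most trivial protocol imaginable: $X$ and $Y$ are independent uniform bits and the transcript is just Alice’s bit, $T=X$. The toy protocol and the function names are assumptions for illustration; only the formula comes from the discussion above.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a (possibly multi-dimensional) probability array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_mutual_information(joint):
    """I(A:C|B) in bits, for a joint distribution indexed as joint[a, c, b]."""
    joint = np.asarray(joint, dtype=float)
    h_ab = entropy(joint.sum(axis=1))
    h_cb = entropy(joint.sum(axis=0))
    h_b = entropy(joint.sum(axis=(0, 1)))
    h_abc = entropy(joint)
    return h_ab + h_cb - h_abc - h_b

# Toy protocol: X, Y independent uniform bits, transcript T = X.
p_xty = np.zeros((2, 2, 2))
for x in range(2):
    for y in range(2):
        p_xty[x, x, y] = 0.25                  # indices [x, t, y] with t = x
p_ytx = np.transpose(p_xty, (2, 1, 0))         # indices [y, t, x]

info_cost = (conditional_mutual_information(p_xty)
             + conditional_mutual_information(p_ytx))
```

Here Bob learns one bit about $X$ and Alice learns nothing about $Y$, so the information cost is one bit, matching the communication. The interesting protocols are the ones where the two quantities differ.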

When minimized over all valid protocols, Dave’s version of Quantum Information Complexity represents exactly the amortized quantum communication complexity. This sounds awesome, but there are a bunch of asterisks. First, “minimized over all valid protocols” is an unbounded minimization (and some of these protocols really do use an infinite number of rounds), although it is in a sense “single-shot” in that it’s considering only protocols for calculating the function once. Also “amortized” here is not quite the same as in Shannon theory. When we talk about the capacity of a channel or its simulation cost (as in the sense of reverse Shannon theorems) we usually demand that the block error rate approach zero. In this case, the information complexity is defined in terms of an error parameter $\epsilon$ (i.e. it is the minimum sum of QCMI’s over all protocols that compute the function up to error $\epsilon$). This then corresponds to the asymptotic cost of simulating a large number of evaluations of the function, each of which is allowed to err with probability $\epsilon$. The analogue in Shannon theory is something called rate-distortion theory.

Before you turn up your nose, though, the current talk gets rid of this amortized restriction. QIC (quantum information complexity) is easily seen to be a lower bound for the communication complexity and this work shows that it is also an upper bound. At least up to a multiplicative factor of $1/\epsilon^2$ and an additive term that also scales with the number of rounds. Since QIC is also a lower bound for the above amortized version of complexity, this proves a direct sum theorem, meaning that computing $n$ function values costs $\Omega(n)$ as much as one function evaluation. Here the weak amortized definition actually makes the result stronger, since we are proving lower bounds on the communication cost. In other words, the lower bound also applies to the case of low block-wise error.

The technical tools are the one-shot redistribution protocol mentioned above (see also this version) and the Jain-Radhakrishnan-Sen substate theorem (recently reproved in 1103.6067 and the subject of a press release that I suppose justifies calling this a “celebrated” theorem). I should write a blog post about how much I hate it when people refer to “celebrated” theorems. Personally I celebrate things like Thanksgiving and New Year’s, not the PCP theorem. But I digress.

### Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Perez-Garcia and Sergii Strelchuk Unbounded number of channel uses are required to see quantum capacity (abstract, arXiv:1408.5115)

Is the quantum capacity of a quantum channel our field’s version of string theory? Along the lines of this great Peter Shor book review, quantum Shannon theory has yielded some delightful surprises, but our attempt to prove an analogue of Shannon’s famous formula $C=\max_p I(A:B)$ has turned into a quagmire that has now lasted longer than the Vietnam War.

Today’s talk is the latest grim news on this front. Yes, we have a capacity theorem for the (unassisted) quantum capacity, the famous LSD theorem, but it requires “regularization” meaning maximizing a rescaled entropic quantity over an unbounded number of channel uses. Of course the definition of the capacity itself involves a maximization over an unbounded number of channel uses, so formally speaking we are not better off, although in practice the capacity formula can often give decent lower bounds. On the other hand, we still don’t know if it is even decidable.

Specifically the capacity formula is

$\displaystyle Q = \lim_{n\rightarrow\infty} Q^{(n)} := \lim_{n\rightarrow\infty} \frac{1}{n} \max_\rho I_c(\mathcal{N}^{\otimes n}, \rho)$,

where $\rho$ is maximized over all inputs to n uses of the channel and $I_c$ is the coherent information (see paper for def). In evaluating this formula, how large do we have to take n? E.g., could we prove that we always have $Q^{(n)} \geq (1-1/n)Q$? If this, or some formula like it, were true then we would get an explicit upper bound on the complexity of estimating capacity.

The main result here is to give us bad news on this front, in fairly strong terms. For any $n$ they define a channel for which $Q^{(n)}=0$ but $Q>0$.

Thus we need an unbounded number of channel uses to detect whether the quantum capacity (i.e. the regularized coherent information) is even zero or nonzero.

The talk reviews other non-additivity examples, including classical, private, zero-error quantum and classical capacities. Are there any good review articles here?

Here’s how the construction works. It builds on the Smith-Yard superactivation result, which combines an erasure channel (whose lack of capacity follows from the no-cloning theorem) and a PPT channel (whose lack of capacity follows from being PPT). The PPT channel is chosen to be able to send private information (we know these exist from a paper by H3O) and by using the structure of these states (further developed in later three-Horodecki-and-an-Oppenheim work), one can show that combining this with an erasure channel can send some quantum information. Specifically the PPT channel produces a “shield” which, if faithfully transmitted to Bob, enables perfect quantum communication.

This new construction is similar but uses a shield with many parts any one of which can be used to extract a valid quantum state. On the other hand, the erasure probability is increased nearly to one, and noise is added as well. Proving this is pretty tough and involves sending many things to zero or infinity at varying rates.

During the question period there was a prolonged jocular discussion, triggered by John Smolin saying that the title was inappropriate: the authors had clearly shown, by examining the parameters of the channel, that the quantum capacity was positive, so detecting positivity of capacity required no channel uses.

D. Gottesman suggested a more operational interpretation of the title: given a black box, how many uses of it are needed to decide whether its quantum capacity is positive? If it were, e.g., an erasure channel with erasure probability very close to 1/2, arbitrarily many uses would be needed to decide confidently. It’s not clear how to formalize this model.

By the way, better Shannon theory news is coming in a few days for bosonic channels with the talk by Andrea Mari.


## QIP 2015 Return of the Live-blogging, Day 1

Jan 14 update at the end.

The three Pontiffs are reunited at QIP 2015 and, having forgotten how painful liveblogging was in the past, are doing it again. This time we will aim for some slightly more selective comments.

In an ideal world the QIP PC would have written these sorts of summaries and posted them on scirate, but instead they are posted on easychair where most of you can’t access them. Sorry about this! We will argue at the business meeting for a more open refereeing process.

The first plenary talk was:

### Ran Raz (Weizmann Institute) How to Delegate Computations: The Power of No-Signaling Proofs (TR13-183)

Why is the set of no-signalling distributions worth looking at? (That is, the set of conditional probability distributions $p(a,b|x,y)$ that have well-defined marginals $p(a|x)$ and $p(b|y)$.) One way to think about it is as a relaxation of the set of “quantum” distributions, meaning the input-output distributions that are compatible with entangled states. The no-signalling polytope is defined by a polynomial number of linear constraints, and so is the sort of relaxation that is amenable to linear programming, whereas we don’t even know whether the quantum value of a game is computable. But is the no-signalling condition ever interesting in itself?
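For a concrete feel for the definition, here is a sketch that checks the no-signalling conditions for two small examples with binary inputs and outputs. The “PR box” used below is the standard textbook example of a no-signalling distribution that is not quantum; it was not part of the talk, and the array layout `p[a, b, x, y]` and function names are my own conventions.

```python
import numpy as np

def pr_box():
    """p[a, b, x, y] = 1/2 if a XOR b == x AND y, else 0."""
    p = np.zeros((2, 2, 2, 2))
    for a in range(2):
        for b in range(2):
            for x in range(2):
                for y in range(2):
                    if (a ^ b) == (x & y):
                        p[a, b, x, y] = 0.5
    return p

def signalling_box():
    """Alice outputs Bob's input y and Bob outputs 0: clearly signalling."""
    p = np.zeros((2, 2, 2, 2))
    for x in range(2):
        for y in range(2):
            p[y, 0, x, y] = 1.0
    return p

def is_no_signalling(p, tol=1e-12):
    """Check that p(a|x,y) does not depend on y and p(b|x,y) does not depend on x."""
    alice = p.sum(axis=1)  # indices [a, x, y]
    bob = p.sum(axis=0)    # indices [b, x, y]
    alice_ok = np.allclose(alice[:, :, 0], alice[:, :, 1], atol=tol)
    bob_ok = np.allclose(bob[:, 0, :], bob[:, 1, :], atol=tol)
    return bool(alice_ok and bob_ok)

def chsh_win_probability(p):
    """Probability that a XOR b == x AND y when x, y are uniform."""
    total = 0.0
    for x in range(2):
        for y in range(2):
            for a in range(2):
                for b in range(2):
                    if (a ^ b) == (x & y):
                        total += 0.25 * p[a, b, x, y]
    return total
```

The PR box wins the CHSH game with certainty, whereas entangled strategies are limited to $\cos^2(\pi/8)\approx 0.854$, so the no-signalling set is a strict relaxation of the quantum set. The conditions checked in `is_no_signalling` are exactly the kind of linear constraints that make the set a polytope.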

Raz and his coauthors (Yael Kalai and Ron Rothblum) prove a major result (which we’ll get to below) about the computational power of multi-prover proof systems where the provers have access to arbitrary non-signalling distributions. But they began by trying to prove an apparently unrelated classical crypto result. In general, multiple provers are stronger than one prover. Classically we have MIP=NEXP and IP=PSPACE, and in fact the MIP protocol requires just one round, whereas k rounds with a single prover is (roughly) within the k’th level of the polynomial hierarchy (i.e. even below PSPACE). So simulating many provers with one prover seems in general crazy.

But suppose instead the provers are computationally limited. Suppose they are strong enough for the problem to be interesting (i.e. they are much stronger than the verifier, so it is worthwhile for the verifier to delegate some nontrivial computation to them) but too weak to break some FHE (fully homomorphic encryption) scheme. This requires computational assumptions, but nothing too outlandish. Then the situation might be very different. If the verifier sends its queries using FHE, then one prover might simulate many provers without compromising security. This was the intuition of a paper from 2000, which Raz and coauthors finally are able to prove. The catch is that even though the single prover can’t break the FHE, it can let its simulated provers play according to a no-signalling distribution. (Or at least this possibility cannot be ruled out.) So proving the security of 1-prover delegated computation requires not only the computational assumptions used for FHE, but also a multi-prover proof system that is secure against no-signalling distributions.

Via this route, Raz and coauthors found themselves in QIP territory. When they started it was known that

• MIPns[2 provers]=PSPACE [0908.2363]
• PSPACE $\subseteq$ MIPns[poly provers] $\subseteq$ EXP [0810.0693]

This work nails down the complexity of the many-prover setting, showing that EXP is contained in MIPns[poly provers], so that in fact the classes are equal.

It is a nice open question whether the same is true for a constant number of provers, say 3. By comparison, three entangled provers or two classical provers are strong enough to contain NEXP.

One beautiful consequence is that optimizing a linear function over the no-signalling polytope is roughly a P-complete problem. Previously it was known that linear programming was P-complete, meaning that it was unlikely to be solvable in, say, log space. But this work shows that this is true even if the constraints are fixed once and for all, and only the objective function is varied. (And we allow error.) This is established in a recent followup paper [ECCC TR14-170] by two of the same authors.

### Francois Le Gall. Improved Quantum Algorithm for Triangle Finding via Combinatorial Arguments (abstract, arXiv:1407.0085)

A technical tour-de-force that we will not do justice to here. One intriguing barrier-breaking aspect of the work is that all previous algorithms for triangle finding worked equally well for the standard unweighted case as well as a weighted variant in which each edge is labeled by a number and the goal is to find a set of edges $(a,b), (b,c), (c,a)$ whose weights add up to a particular target. Indeed this algorithm has a query complexity for the unweighted case that is known to be impossible for the weighted version. A related point is that this shows the limitations of the otherwise versatile non-adaptive learning-graph method.

### Ryan O’Donnell and John Wright Quantum Spectrum Testing (abstract, arXiv:1501.05028)

A classic problem: given $\rho^{\otimes n}$ for $\rho$ an unknown d-dimensional state, estimate some property of $\rho$. One problem where the answer is still shockingly unknown is to estimate $\hat\rho$ in a way that achieves $\mathbb{E} \|\rho-\hat \rho\|_1 \leq\epsilon$.
Results from compressed sensing show that $n = \tilde\Theta(d^2r^2)$ for single-copy two-outcome measurements of rank-$r$ states with constant error, but if we allow block measurements then maybe we can do better. Perhaps $O(d^2/\epsilon)$ is possible using the Local Asymptotic Normality results of Guta and Kahn [0804.3876], as Hayashi has told me, but the details are, if we are feeling generous, still implicit. I hope that he, or somebody, works them out. (18 Jan update: thanks Ashley for fixing a bug in an earlier version of this.)

The current talk focuses instead on properties of the spectrum, e.g. how many copies are needed to distinguish a maximally mixed state of rank $r$ from one of rank $r+c$? The symmetry of the problem (invariant under both permutations and rotations of the form $U^{\otimes n}$) means that we can WLOG consider “weak Schur sampling” meaning that we measure which $S_n \times U_d$ irrep our state lies in, and output some function of this result. This irrep is described by an integer partition which, when normalized, is a sort of mangled estimate of the spectrum. It remains only to analyze the accuracy of this estimator in various ways. In many of the interesting cases we can say something nontrivial even if $n= o(d^2)$. This involves some delicate calculations using a lot of symmetric polynomials. Some of these first steps (including many of the canonical ones worked out much earlier by people like Werner) are in my paper quant-ph/0609110 with Childs and Wocjan. But the current work goes far far beyond our old paper and introduces many new tools.
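One way to play with this estimator without a quantum computer: the partition produced by weak Schur sampling on $\rho^{\otimes n}$ has the same distribution as the shape of the Young tableau obtained by running the RSK algorithm on $n$ i.i.d. letters drawn from the spectrum of $\rho$. The sketch below relies on that classical description; it is a simulation aid rather than the measurement itself, and the function names and example spectrum are illustrative assumptions.

```python
import bisect
import random

def rsk_shape(word):
    """Shape of the RSK insertion tableau of a word (row insertion with bumping)."""
    rows = []
    for letter in word:
        for row in rows:
            pos = bisect.bisect_right(row, letter)  # first entry strictly larger
            if pos == len(row):
                row.append(letter)
                break
            row[pos], letter = letter, row[pos]      # bump into the next row
        else:
            rows.append([letter])
    return [len(row) for row in rows]

def sample_spectrum_estimate(spectrum, n, seed=0):
    """Draw a partition via RSK on an i.i.d. word and normalize it by n."""
    rng = random.Random(seed)
    word = rng.choices(range(len(spectrum)), weights=spectrum, k=n)
    shape = rsk_shape(word)
    shape += [0] * (len(spectrum) - len(shape))
    return [part / n for part in shape]

estimate = sample_spectrum_estimate([0.7, 0.2, 0.1], n=2000)
```

The normalized partition is sorted in decreasing order and is biased toward being more “spread out” than the true spectrum, which is one sense in which it is a mangled estimate.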

### Han-Hsuan Lin and Cedric Yen-Yu Lin. Upper bounds on quantum query complexity inspired by the Elitzur-Vaidman bomb tester (abstract, arXiv:1410.0932)

This talk considers a new model of query complexity inspired by the Elitzur-Vaidman bomb tester. The bomb tester is a classic demonstration of quantum weirdness: You have a collection of bombs that have a detonation device so sensitive that even a single photon impacting it will set it off. Some of these bombs are live and some are duds, and you’d like to know which is which. Classically, you don’t stand a chance, but quantum mechanically, you can put a photon into a beamsplitter and place the bomb in one arm of a Mach-Zehnder interferometer. A dud leaves the interference intact, so the same detector always clicks. A live bomb destroys the interference, and if it doesn’t explode you have a 50/50 chance that the other detector clicks, which reveals that the bomb is live! There are various tricks that you can play related to the quantum Zeno effect that let you do much better than this success probability.
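The Zeno trick can be sketched in a few lines. Instead of a 50/50 beamsplitter, rotate the photon only slightly toward the bomb’s arm, by an angle $\pi/(2N)$, and repeat $N$ times. A live bomb measures the photon after every rotation and almost always resets it to the safe arm; a dud lets the rotations accumulate to a full $\pi/2$. The two-amplitude model and the function name below are simplifying assumptions, not the query model from the paper.

```python
import math

def zeno_bomb_test(num_steps, live):
    """Return (probability of no explosion, probability the photon ends in the safe arm)."""
    theta = math.pi / (2 * num_steps)
    amp_safe, amp_bomb = 1.0, 0.0
    p_no_explosion = 1.0
    for _ in range(num_steps):
        # Small rotation of the photon from the safe arm toward the bomb arm.
        amp_safe, amp_bomb = (
            math.cos(theta) * amp_safe - math.sin(theta) * amp_bomb,
            math.sin(theta) * amp_safe + math.cos(theta) * amp_bomb,
        )
        if live:
            # A live bomb measures the bomb arm: it explodes with probability
            # amp_bomb**2, otherwise the photon collapses back to the safe arm.
            p_no_explosion *= amp_safe ** 2
            amp_safe, amp_bomb = 1.0, 0.0
    return p_no_explosion, amp_safe ** 2

live_result = zeno_bomb_test(1000, live=True)
dud_result = zeno_bomb_test(1000, live=False)
```

With a live bomb the survival probability is $\cos^{2N}(\pi/2N) \approx 1 - \pi^2/(4N)$ and the photon is found in the safe arm; with a dud it ends up in the other arm. So finding the photon in the safe arm certifies a live bomb, at the cost of $N$ interactions.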

The authors define a model of query complexity where one risks explosion for some events, and they showed that the quantum query complexity is related to the bomb query complexity by $B(f) = \Theta(Q(f)^2)$. There were several other interesting results in this talk, but we ran out of steam as it was the last talk before lunch.

### Kirsten Eisentraeger, Sean Hallgren, Alexei Kitaev and Fang Song. A quantum algorithm for computing the unit group of an arbitrary degree number field (STOC 2014)

The unit group is a fundamental object in algebraic number theory. It comes up frequently in applications as well, and is used for fully homomorphic encryption, code obfuscation, and many other things.

My [Steve] personal way of understanding the unit group of a number field is that it is a sort of gauge group with respect to the factoring problem. The units in a ring are those numbers with multiplicative inverses. In the ring of integers, where the units are just $\pm1$, we can factor a composite number in more than one way, e.g. $6 = 3 \times 2 = (-3)\times (-2)$. Both of these are equally valid factorizations; they are equivalent modulo units. In more complicated settings where unique factorization fails, we have factorization into prime ideals, and the group of units can in general become infinite (though always discrete).
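To see an infinite unit group in the smallest possible example (ours, not one from the talk), take the ring $\mathbb{Z}[\sqrt{2}]$. An element $a + b\sqrt{2}$ is a unit exactly when its norm $a^2 - 2b^2$ is $\pm 1$, and every power of $1+\sqrt{2}$ qualifies. The sketch below stores $a + b\sqrt{2}$ as the integer pair `(a, b)`; the helper names are hypothetical.

```python
def multiply(x, y):
    """Multiply a + b*sqrt(2) by c + d*sqrt(2), both stored as integer pairs."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    """Norm of a + b*sqrt(2); the element is a unit iff this is +1 or -1."""
    a, b = x
    return a * a - 2 * b * b

def powers_of_unit(count):
    """Return the first `count` powers of 1 + sqrt(2), starting with the first."""
    unit = (1, 1)
    current = (1, 0)  # the number 1
    result = []
    for _ in range(count):
        current = multiply(current, unit)
        result.append(current)
    return result

for element in powers_of_unit(5):
    print(element, norm(element))
```

All of these powers are distinct, so the unit group is infinite, yet every factorization they generate is "the same" modulo units, just as with the signs in $6 = 3 \times 2 = (-3)\times(-2)$.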

The main result of this talk is a quantum algorithm for finding the unit group of a number field of arbitrary degree. One of the technical problems that they had to solve to get this result was the hidden subgroup problem on a continuous group, namely $\mathbb{R}^n$.

The speaker also announced some work in progress: a quantum algorithm for the principal ideal problem and the class group problem in arbitrary degree number fields [Biasse Song ‘14]. It sounds like not all the details of this are finished yet.

### Dominic Berry, Andrew Childs and Robin Kothari. Hamiltonian simulation with nearly optimal dependence on all parameters (arXiv:1501.01715)

Hamiltonian simulation is not only the original killer app of quantum computers, but also a key subroutine in a large and growing number of problems. I remember thinking it was pretty slick that higher-order Trotter-Suzuki could achieve a run-time of $\|H\|t\text{poly}(s)(\|H\|t/\epsilon)^{o(1)}$ where $t$ is the time we simulate the Hamiltonian for and $s$ is the sparsity. I also remember believing that the known optimality theorems for Trotter-Suzuki (sorry I can’t find the reference, but it involves decomposing $e^{t(A+B)}$ for the free Lie algebra generated by $A,B$) meant that this was essentially optimal.

Fortunately, Berry, Childs and Kothari (and in other work, Cleve) weren’t so pessimistic, and have blasted past this implicit barrier. This work synthesizes everything that comes before to achieve a run-time of $\tau \text{poly}\log(\tau/\epsilon)$ where $\tau = \|H\|_{\max}st$ (here $\|H\|_{\max} = \max_{i,j} |H_{i,j}|$, which can be related to the earlier bounds via $\|H\| \leq s \|H\|_{\max}$).
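
A quick justification of that norm bound, assuming “sparsity” means that each row of $H$ has at most $s$ nonzero entries: for Hermitian $H$ the spectral norm is at most the largest absolute row sum, and each row sum has at most $s$ terms of size at most $\|H\|_{\max}$, so
$\|H\| \leq \max_i \sum_j |H_{i,j}| \leq s \|H\|_{\max}$
This is just the standard textbook estimate, included here as a sanity check rather than as part of the talk.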

One quote I liked: “but this is just a generating function for Bessel functions!” Miraculously, Dominic makes that sound encouraging. The lesson I suppose is to find an important problem (like Hamiltonian simulation) and to approach it with courage.

### Salman Beigi and Amin Gohari. Wiring of No-Signaling Boxes Expands the Hypercontractivity Ribbon (arXiv:1409.3665)

If you have some salt water with salt concentration 0.1% and some more with concentration 0.2%, then anything in the range [0.1, 0.2] is possible, but no amount of mixing will give you even a single drop with concentration 0.05% or 0.3%, even if you start with oceans at the initial concentrations. Similarly, if Alice and Bob share an unlimited number of locally unbiased random bits with correlation $\eta$, they cannot produce even a single bit with correlation $\eta' > \eta$ if they don’t communicate. This was famously proved by Reingold, Vadhan and Wigderson.

This talk does the same thing for no-signaling boxes. Let’s just think about noisy PR boxes to make this concrete. The exciting thing about this work is that it doesn’t just prove a no-distillation theorem but it defines an innovative new framework for doing so. The desired result feels like something from information theory, in that there is a monotonicity argument, but it needs to use quantities that do not increase with tensor product.

Here is one such quantity. Define the classical correlation measure $\rho(A,B) = \max \text{Cov}(f,g)$ where $f:A\to \mathbb{R}$ and $g:B\to \mathbb{R}$ each have variance 1. Properties:

• $0 \leq \rho(A,B) \leq 1$
• $\rho(A,B) =0$ iff $p_{AB} = p_A \cdot p_B$
• $\rho(A^n, B^n) = \rho(A,B)$
• for any no-signaling box, $\rho(A,B) \leq \max(\rho(A,B|X,Y), \rho(X,Y))$

Together these properties show that no wiring of boxes can increase this quantity.
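
For a pair of bits the maximization in $\rho(A,B)$ can be done by hand, which makes a handy sanity check on the first two properties. The sketch below is ours and covers only the binary case: every variance-1 function of a bit is, up to sign and an additive constant, the standardized bit itself, so $\rho$ collapses to the absolute Pearson correlation. The function names and the `joint[a][b]` layout are assumptions for illustration.

```python
import math

def correlated_bits(eta):
    """Joint distribution of two unbiased bits that agree with probability (1 + eta) / 2."""
    same = (1 + eta) / 4
    diff = (1 - eta) / 4
    return [[same, diff], [diff, same]]

def maximal_correlation_bits(joint):
    """rho(A,B) for binary A and B, where joint[a][b] = P(A=a, B=b)."""
    p_a = joint[1][0] + joint[1][1]
    p_b = joint[0][1] + joint[1][1]
    cov = joint[1][1] - p_a * p_b
    var_a = p_a * (1 - p_a)
    var_b = p_b * (1 - p_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant bit is independent of everything
    return abs(cov) / math.sqrt(var_a * var_b)

print(maximal_correlation_bits(correlated_bits(0.3)))
print(maximal_correlation_bits([[0.42, 0.28], [0.18, 0.12]]))  # a product distribution
```

For unbiased bits with correlation $\eta$ this returns $\eta$ (up to rounding), and for a product distribution it returns zero, matching the list above. Checking the tensorization property would need the general definition rather than this binary shortcut.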

The proof of this involves a more sophisticated correlation measure that is not just a single number but is a region called the hypercontractivity ribbon (originally due to [Ahlswede, Gacs ‘76]). This is defined to be the set of $(\lambda_1, \lambda_2)$ such that for any $f,g$ we have
$\mathbb{E}[f_A g_B] \leq \|f_A\|_{\frac{1}{\lambda_1}} \|g_B\|_{\frac{1}{\lambda_2}}$
A remarkable result of [Nair ‘14] is that this is equivalent to the condition that
$I(U;AB) \geq \lambda_1 I(U;A) + \lambda_2 I(U;B)$
for any extension of the distribution on AB to one on ABU.

Some properties.

• The ribbon is $[0,1]\times [0,1]$ iff A,B are independent.
• It is stable under tensor power.
• Monotonicity: local operations on A,B enlarge the ribbon $R$.

For boxes define $R(A,B|X,Y) = \cap_{x,y} R(A,B|x,y)$. The main theorem is then that wiring never shrinks the hypercontractivity ribbon. And as a result, PR box noise cannot be reduced.

These techniques are beautiful and seem as though they should have further application.

### Masahito Hayashi. Estimation of group action with energy constraint (arXiv:1209.3463)

Your humble bloggers were at this point also facing an energy constraint which limited our ability to estimate what happened. The setting is that you pick a state, nature applies a unitary (specifically from a group representation) and then you pick a measurement and try to minimize the expected error in estimating the group element corresponding to what nature did. The upshot is that entanglement seems to give a quadratic improvement in metrology. Noise (generally) destroys this. This talk showed that a natural energy constraint on the input also destroys this. One interesting question from Andreas Winter was about what happens when energy constraints are applied also to the measurement, along the lines of 1211.2101 by Navascues and Popescu.

Jan 14 update: forgot one! Sorry Ashley.

### Ashley Montanaro. Quantum pattern matching fast on average (arXiv:1408.1816)

Continuing the theme of producing shocking and sometimes superpolynomial speedups to average-case problems, Ashley shows that finding a random pattern of length $m$ in a random text of length $n$ can be done in quantum time $\tilde O(\sqrt{n/m}\exp(\sqrt{\log m}))$. Here “random” means something subtle. The text is uniformly random and the pattern is either uniformly random (in the “no” case) or is a random substring of the text (in the “yes” case). There is also a higher-dimensional generalization of the result.

One exciting thing about this is that it is a fairly natural application of Kuperberg’s algorithm for the dihedral-group HSP; in fact the first such application, although Kuperberg’s original paper does mention a much less natural such variant. (correction: not really the first – see Andrew’s comment below.)

It is interesting to think about this result in the context of the general question about quantum speedups for promise problems. It has long been known that query complexity cannot be improved by more than a polynomial (perhaps quadratic) factor for total functions. The dramatic speedups for things like the HSP, welded trees and even more contrived problems must then use the fact that they work for partial functions, and indeed even “structured” functions. Pattern matching is of course a total function, but not one that will ever be hard on average over a distribution with, say, i.i.d. inputs. Unless the pattern is somehow planted in the text, most distributions simply fail to match with overwhelming probability. It is funny that for i.i.d. bit strings this stops being true when $m = O(\log n)$, which is almost exactly when Ashley’s speedup becomes merely quadratic. So pattern matching is a total function whose hard distributions all look “partial” in some way, at least when quantum speedups are possible. This is somewhat vague, and it may be that some paper out there expresses the idea more clearly.
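
The $m = O(\log n)$ threshold is easy to see with a back-of-the-envelope count (our sketch, not from the paper): by linearity of expectation, a uniformly random $m$-bit pattern occurs in an independent uniformly random $n$-bit text an expected $(n-m+1)/2^m$ times, and by the union bound this also upper-bounds the probability of any match.

```python
def expected_matches(n, m):
    """Expected number of occurrences of a random m-bit pattern in a random n-bit text."""
    if m > n:
        return 0.0
    # Each of the n - m + 1 alignments matches with probability 2^-m.
    return (n - m + 1) / 2 ** m

n = 2 ** 20
for m in (10, 20, 40):
    print(m, expected_matches(n, m))
```

With $n = 2^{20}$, the count is around a thousand at $m = 10$, about one at $m = 20 = \log_2 n$, and below $10^{-6}$ at $m = 40$, so unplanted patterns much longer than $\log_2 n$ essentially never match.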

Part of the strength of this paper is then finding a problem where the promise is so natural. It gives me new hope for the future relevance of things like the HSP.


## Your Guide to Australian Slang for QIP Sydney

To everyone that’s attending QIP, welcome to Sydney!

Since I’ve already had to clarify a number of the finer points of Australian slang to my fellow attendees, I thought I would solve the general problem and simply post a helpful dictionary that translates some uniquely Australian words and usages into standard American English.

Also, this thing on the right is called an ibis. It’s not venomous.

## Coffee

Flat white – Try this at least once while you’re here, preferably prepared by a highly skilled barista at one of the better cafes. It’s similar to a latte or to a cappuccino without the foam, but there are important differences.

Long black – Australian version of the Americano, a bit stronger and with crema. It’s the closest you’ll get to a cup of filtered drip coffee, if that’s your thing.

Short black – If you want a standard espresso, order a short black.

## The Beach

Thongs – Sandals, or flip-flops. The highest level of dress code in Australia is “no thongs”.

Togs – Swimwear.

Esky – A cooler; the place where you store your beer to keep it cold while you’re getting pissed at the beach.

Pissed – Drunk; the state that a nontrivial fraction of people are in because it’s legal to drink at the beach.

Sunnies – Sunglasses.

Mozzy – Mosquito. Usually not a problem at the beach because there is almost always a breeze.

## The Pub

Schooner – (SKOO-ner) A medium-sized glass of beer.

Jug – A pitcher of beer.

Shout – To buy a beer for someone, or a round of beers for your table.

Skol – To chug a beer. Usage: “Hey Robbo, if you skol that schooner I’ll shout you a jug.”

Hotel – In addition to the standard meaning, a hotel is a particular style of pub. It usually has high occupancy and a limited beer selection (though this is starting to improve as craft beer is finally catching on here).

## Sports

Football – see “Footy”.

Footy – Rugby. It comes in several varieties, with League and Union being the two most popular.

Gridiron – American football. Not generally watched much down under.

Cricket – An inscrutable game that takes 5 days to play. I think the only way you could like this game is to have the British invade, conquer your land, and occupy your territory under their colonial yoke for at least a few generations. That seems to be how everyone else got into it.

Rooting – Do not make the mistake of saying that you are “rooting for team X”; in Australia, rooting is slang for having sex.

## Miscellaneous

Arvo – Afternoon.

Bickie – A cookie or biscuit.

Brekkie – Breakfast.

Fair dinkum – The closest translation is probably “for real”. It’s used to express the sentiment that you’re not deceiving the listener or exaggerating your claims.

## Should Papers Have Unit Tests?

Perhaps the greatest shock I’ve had in moving from the hallowed halls of academia to the workman depths of everyday software development is the amount of testing that is done when writing code. Likely I’ve written more test code than non-test code over the last three plus years at Google. The most common type of test I write is a “unit test”, in which a small portion of code is tested for correctness (hey Class, do you do what you say?). The second most common type is an “integration test”, which attempts to test that the units working together are functioning properly (hey Server, do you really do what you say?). Testing has many benefits: correctness of code, of course, but it is also important for ease of changing code (refactoring), supporting decoupled and simplified design (untestable code is often a sign that your units are too complicated, or that your units are too tightly coupled), and more.

Over the holiday break, I’ve been working on a paper (old habit, I know) with lots of details that I’d like to make sure I get correct. Throughout the entire paper writing process, one spends a lot of time checking and rechecking the correctness of the arguments. And so the thought came to my mind while writing this paper, “boy it sure would be easier to write this paper if I could write tests to verify my arguments.”

In a larger sense, all papers are a series of tests, small arguments convincing the reader of the veracity or likelihood of the given argument. And testing in a programming environment has a vital distinction that the tests are automated, with the added benefit that you can run them often as you change code and gain confidence that the contracts enforced by the tests have not been broken. But perhaps there would be a benefit to writing a separate argument section with “unit tests” for different portions of a main argument in a paper. Such unit test sections could be small, self-contained, and serve as supplemental reading that could be done to help a reader gain confidence in the claims of the main text.

I think some of the benefits for having a section of “unit tests” in a paper would be

• Documenting limit tests: A common trick of the trade in physics papers is to take a parameter to a limiting value to see how the equations behave. Often one can recover known results in such limits, or show that certain relations hold after you scale these. These types of arguments give you confidence in a result, but are often left out of papers. This is akin to edge case testing by programmers.
• Small examples: When a paper gets abstract, one often spends a lot of time trying to ground oneself by working with small examples (unless you are Grothendieck, of course). Often one writes a paper by interjecting these examples in the main flow of the paper, but they fit more naturally in a unit testing section.
• Alternative explanation testing: When you read an experimental physics paper, you often wonder, am I really supposed to believe the effect that they are talking about? Often large portions of the paper are devoted to trying to settle such arguments, but when you listen to experimentalists grill each other you find that there is an even further depth to these arguments. “Did you consider that your laser is actually exciting X, and all you’re seeing is Y?” The amount of this that goes on is huge, and sadly, not documented for the greater community.
• Combinatorial or property checks: Often one finds oneself checking that a result works by doing something like counting instances to check that they sum to a total, or that a property holds before and after a transformation (an invariant). While these are useful for providing evidence that an argument is correct, they can often feel a bit out of place in a main argument.

Of course it would be wonderful if there were a way that these little “units” could be automatically executed. But the best path I can think of right now towards getting to that starts with the construction of an artificial mind. (Yeah, I think perhaps I’ve been at Google too long.)
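
In the special case where a claim boils down to a concrete formula, a limit test or a counting check can already be run by machine. Here is a minimal sketch of what such a “unit test section” might look like; the two examples (the Newtonian limit of relativistic kinetic energy, and subsets summing to $2^n$) are stand-ins chosen for illustration, not taken from any particular paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def relativistic_kinetic_energy(mass, speed):
    """Kinetic energy (gamma - 1) * m * c^2 of a particle."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / SPEED_OF_LIGHT) ** 2)
    return (gamma - 1.0) * mass * SPEED_OF_LIGHT ** 2

def test_newtonian_limit():
    """Limit test: at low speed the formula should reduce to m * v^2 / 2."""
    mass, speed = 2.0, 3.0e5  # speed is about 0.1% of c
    newtonian = 0.5 * mass * speed ** 2
    relative_error = abs(relativistic_kinetic_energy(mass, speed) - newtonian) / newtonian
    assert relative_error < 1e-5

def test_subsets_sum_to_total():
    """Counting check: subsets of each size must add up to all 2^n subsets."""
    n = 12
    assert sum(math.comb(n, k) for k in range(n + 1)) == 2 ** n

test_newtonian_limit()
test_subsets_sum_to_total()
```

The alternative-explanation kind of test is exactly the part that does not reduce to a few asserts, which is where the artificial mind would have to come in.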


## Self-correcting Fractals

A really exciting paper appeared on the arxiv today: A proposal for self-correcting stabilizer quantum memories in 3 dimensions (or slightly less), by Courtney Brell. It gives the strongest evidence yet that self-correcting quantum memories are possible in “physically realistic” three-dimensional lattice models. In particular, Courtney has constructed families of local Hamiltonians in 3D whose terms consist of X- and Z-type stabilizer generators and that show phase-transition behavior akin to the 2D Ising model for both the X- and Z-type error sectors. This result doesn’t achieve a complete theoretical solution to the question of whether self-correcting quantum memories can exist in principle, as I’ll explain below, but it makes impressive progress using a mix of rigorous analysis and physical argument.

First, what do I mean by “physically realistic”? Well, obviously I don’t mean physically realistic (without quotes)—that’s a much greater challenge. Rather, we want to abstractly characterize some features that should be shared by a physically realistic implementation, but with enough leeway that a theorist can get creative. To capture this, Courtney introduces the so-called Caltech Rules for a self-correcting quantum memory.

The phrase “the Caltech Rules” is (I believe) attributable to David Poulin. Quantum memory aficionados have been debating these rules in emails and private discussions for the last few years, but I think this is the first time someone has put them in print. As rules, they aren’t really set in stone. They consist of a list of criteria that are either necessary or seemingly necessary to avoid models that are self-correcting for trivial and unphysical reasons (e.g., scaling the coupling strengths as a function of $n$). In Courtney’s version of the rules, we require a model with finite-dimensional spins (so no bosonic or fermionic models allowed… this might be objectionable to some people), bounded-strength short-range interactions between the spins, a constant density of spins, a perturbatively stable degenerate ground space for the encoded states, an efficient decoding algorithm, and an exponential memory lifetime against low-temperature thermal noise. One might wish to add even more desiderata like translation-invariant couplings or a spectral gap (which is closely related to stability), but finding a self-correcting memory subject to these constraints is already a tall order. For some more discussion on these points, check out another awesome paper that came on the arxiv yesterday, an excellent review article on quantum memories at finite temperature by Ben Brown et al.

To motivate the construction, it helps to remember everyone’s favorite models, the Ising model and the Toric code. When the temperature $T$ is zero, it’s easy to store a classical bit using the 1D Ising model; this is just a repetition code. Similarly, the 2D toric code can store quantum information at $T=0$. Both of these codes become unstable as memories at $T > 0$ because of the presence of string-like logical operators. The physical process by which these strings are created costs some energy, but then the strings can stretch and grow without any energy cost, and thermal fluctuations alone will create enough strings in a short time to cause a decoding failure. By contrast, the 2D Ising model can store a classical bit reliably for an exponential amount of time if you encode in the total magnetization and you are below the Curie temperature. The logical operators are now membranes that cost energy to grow. Similarly, the 4D toric code has such a phase transition, and this is because the X- and Z-type errors both act analogously to 2D Ising models with membranous logical operators.
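
The difference between the 1D and 2D Ising models can be seen by simply counting unsatisfied bonds, each of which costs the same fixed energy. The toy script below is our illustration (open boundary conditions, a single flipped region placed away from the edges; all names are hypothetical): a flipped segment in 1D always breaks two bonds however long it is, while a flipped square in 2D breaks a number of bonds proportional to its perimeter.

```python
def broken_bonds_1d(chain_length, start, flipped_length):
    """Count unsatisfied bonds in an open 1D Ising chain with one flipped segment."""
    spins = [1] * chain_length
    for i in range(start, start + flipped_length):
        spins[i] = -1
    return sum(1 for i in range(chain_length - 1) if spins[i] != spins[i + 1])

def broken_bonds_2d(lattice_size, start, flipped_side):
    """Count unsatisfied bonds in an open 2D Ising lattice with one flipped square."""
    spins = [[1] * lattice_size for _ in range(lattice_size)]
    for r in range(start, start + flipped_side):
        for c in range(start, start + flipped_side):
            spins[r][c] = -1
    broken = 0
    for r in range(lattice_size):
        for c in range(lattice_size):
            # Check the bond to the neighbour below and the one to the right.
            if r + 1 < lattice_size and spins[r][c] != spins[r + 1][c]:
                broken += 1
            if c + 1 < lattice_size and spins[r][c] != spins[r][c + 1]:
                broken += 1
    return broken

for size in (1, 5, 20):
    print(size, broken_bonds_1d(50, 10, size), broken_bonds_2d(40, 5, size))
```

The 1D column stays at 2 while the 2D column grows as four times the side length, which is the energy barrier that protects the magnetization.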

Sierpinski carpet, with edges placed to form a “Sierpinski graph”.

The codes that Courtney defines are called embeddable fractal product codes (EFPC). The idea is that, if a product of two 1D Ising models isn’t a 2D self-correcting model, but a product of two 2D Ising models is a self-correcting memory, then what happens if we take two 1.5D Ising models and try to make a 3D self-correcting memory? The backbone of the construction consists of fractals such as the Sierpinski carpet that have infinite ramification order, meaning that an infinite number of edges on an associated graph must be cut to split it into two infinite components. Defining an Ising model on the Sierpinski graph yields a finite-temperature phase transition for the same reason as the 2D Ising model, the Peierls argument, which is essentially a counting argument about the density of domain walls in equilibrium with fixed boundary conditions. This is exactly the kind of behavior needed for self-correction.

Splitting the Sierpinski graph into two infinite components necessarily cuts an infinite number of edges.
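
As a rough numerical companion to the figure, the sketch below builds a finite-level Sierpinski carpet and counts how many edges a straight vertical cut must cross. It assumes the simplest possible graph (one vertex per retained cell, edges between horizontally or vertically adjacent cells), which may differ in detail from the graph in the paper, and straight cuts are only a proxy for the ramification order, which concerns arbitrary cuts.

```python
def carpet_cells(level):
    """Return the set of (row, col) cells kept in a level-`level` Sierpinski carpet."""
    size = 3 ** level
    cells = set()
    for r in range(size):
        for c in range(size):
            rr, cc = r, c
            keep = True
            while rr > 0 or cc > 0:
                if rr % 3 == 1 and cc % 3 == 1:
                    keep = False  # inside a removed central square at some scale
                    break
                rr //= 3
                cc //= 3
            if keep:
                cells.add((r, c))
    return cells

def min_vertical_cut(level):
    """Fewest horizontal edges crossed by any straight vertical cut (level >= 1)."""
    size = 3 ** level
    cells = carpet_cells(level)
    best = None
    for c in range(size - 1):
        crossed = sum(1 for r in range(size)
                      if (r, c) in cells and (r, c + 1) in cells)
        if best is None or crossed < best:
            best = crossed
    return best

for level in (1, 2, 3):
    print(level, len(carpet_cells(level)), min_vertical_cut(level))
```

The cheapest straight cut crosses 2, 4 and 8 edges at levels 1, 2 and 3, so it keeps growing as the fractal is refined, in contrast to a 1D chain where a single cut edge always suffices.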

Using the adjacency of the Sierpinski graph, the next step is to use a toric code-like set of generators on this graph, paying careful attention to the boundary conditions (in particular, plaquette terms are placed in such a way that the stabilizer group contains all the cycles that bound areas of the fractal, at any length scale). Then using homological product codes gives a natural way to combine X-like and Z-like copies of this code into a new code that naturally lives in four dimensions. Although the natural way to embed this code requires all four spatial dimensions, it turns out that a low-distortion embedding is possible with distortion bounded by a small constant, so these codes can be compressed into three dimensions while retaining the crucial locality properties.

Remarkably, this construction gives a finite-temperature phase transition for both the X- and Z-type errors. It essentially inherits this from the fact that the Ising models on the Sierpinski graph have phase transitions, and it is a very strong indication of self-correcting behavior.

However, there are some caveats. There are many logical qubits in this code (in fact, the code has constant rate), and only the qubits associated to the coarsest features of the fractal have large distance. There are many logical qubits associated to small-scale features that have small distance and create an exponential degeneracy of the ground space. With such a large degeneracy, one worries about perturbative stability in the presence of a generic local perturbation. There are a few other caveats, for example the question of efficient decoding, but to me the issue of the degeneracy is the most interesting.

Overall, this is the most exciting progress since Haah’s cubic code. I think I’m actually becoming optimistic about the possibility of self-correction. It looks like Courtney will be speaking about his paper at QIP this year, so this is yet another reason to make it to Sydney this coming January.


## A Breakthrough Donation for Computer Science

Lance Fortnow has a post summarizing some of the news affecting the CS community over the past month, including updates on various prizes as well as the significant media attention focusing on physics- and math-related topics such as movies about Turing and Hawking as well as Terence Tao on the Colbert Report.

From his post, I just learned that former Microsoft chief executive Steven Ballmer is making a donation to Harvard that will endow twelve—that’s right, 12—new tenured and tenure-track faculty positions in computer science. This is fantastic news and will have a huge positive impact on Harvard CS.

One thing missing from Lance’s list was news about the Breakthrough Prizes in mathematics and fundamental physics. In case you’ve been living under a rock, these prizes give a very hefty US $3 million purse to the chosen recipients. The winners are all luminaries in their field, and it’s great to see them get recognition for their outstanding work. On the other hand, juxtaposing Ballmer’s donation and the Breakthrough Prizes couldn’t offer a starker contrast. It costs the same amount—$3 million—to endow a university full professor with appointments in more than one discipline at Duke University. My initial googling would suggest that this is a pretty typical figure at top-tier institutions.