Now that's what I call self correcting

credit: University of Illinois

Apparently this is the year for breakthroughs in self-correcting computer hardware. After hearing about Jeongwan Haah’s new self-correcting topological quantum memory at QIP, I just learned, via BBC news, about a new type of (classical) self correcting circuit: one which heals itself when one of the wires cracks! The full paper can be found here, and it is mostly quite readable. The basic idea is to start with a standard classical wire made of (for example) gold. The researchers sprinkled the wire with tiny capsules filled with a metal alloy (Ga-In) which is liquid at room temperature and has high conductivity. Then they bent the circuit board until it cracked, breaking the wire, and hence the circuit. Within milliseconds, the capsules also broke, the cracks filled with the liquid metal, and conductivity was restored. Self correcting, indeed!
One thing I didn’t understand is how the liquid metal stays in the cracks. I guess that at the scale they are working at, the surface tension alone is sufficient to keep the liquid metal in place?

QIP 2012 Day 5

The quantum pontiff brains have reached saturation.

Eric Chitambar, Wei Cui and Hoi-Kwong Lo:
Increasing Entanglement by Separable Operations and New Monotones for W-type Entanglement

These results demonstrate a large quantitative gap between LOCC and SEP for a particular task called random EPR distillation. Therefore, SEP may not always be a good approximation for LOCC. They also demonstrate entanglement monotones that can increase under SEP, the first known examples with analytic functions. They also show that LOCC is not a closed set of operations, so optimal LOCC protocols may not exist.
Recall how LOCC works: Alice and Bob share a bipartite pure state $latex |psirangle_{AB}$. Alice makes a measurement on her system and sends some classical bits to Bob. Bob then makes a measurement on his system and sends some bits to Alice. They repeat this many times. Any LOCC operation is a collection of maps $latex {mathcal{E}_j}$ such that the sum of the maps is trace preserving and each map has a separable Kraus decomposiiton where each operator can be build from successive rounds of local measurements. This is a pretty difficult condition to study, so it is convenient to relax LOCC to SEP. For SEP, we drop the “successive rounds of local measurements” requirement on the Kraus operators. Given an arbitrary SEP, we can always implement it with LOCC if we allow for some probability of failure, i.e. SLOCC. Can we eliminate this probability of failure? Not in general. There are maps that are in SEP but not in LOCC, as first demonstrated by [Bennett et al.] (“Quantum nonlocality without entanglement”), and subsequently investigated by many other authors.
So now we know that SEP and LOCC are not equal, but how non-equal are they? That is, can we quantify the gap between the two classes? There is some previous work, such as Bennett et al., who showed that the mutual information for state discrimination had a gap of at least $latex 10^{-6}$, and work by Koashi et al. who showed that there is a gap in the success probability for unambiguously distinguishing states was at least 0.8%. These gaps look rather small, so you might plausibly conjecture that SEP is a good approximation for LOCC in general.
SEP is precisely the class of operations that cannot create entanglement out of nothing, but if we seed things with a little bit of entanglement, can we use SEP to increase the entanglement? We need to define some LOCC monotone to make this a meaning statement.
Surprisingly, yes! There are entanglement transforms that work in SEP but not LOCC, and therefore entanglement monotones (but artificial ones) that can increase under SEP but not LOCC. To give some intuition, though, here is a non-artificial task.
Random-party distillation of W-class states:
An N-partite W-class state looks (up to local unitary rotation) like
$latex |vec{x}rangle = sqrt{x_0} |00ldots 00rangle + sqrt{x_1} |10ldots 00rangle + sqrt{x_2} |01ldots 00rangle + sqrt{x_n} |00ldots 01rangle$.
It’s a good class of states to study because they are only parameterized by N real numbers (as opposed to 2N), it is easy to characterize how the states transform under local measurements, and it is closed under SLOCC. See this paper by Kintas and Turgut for a review of the properties of W-class states.
Here is a nice example, due to Fortescue and Lo. Start with a tripartite W state, and have each party perform the measurement $latex {M_0, M_1}$, with $latex M_0 = sqrt{1-epsilon}|0ranglelangle 0|+|1ranglelangle 1|$ and $latex M_1 = sqrt{epsilon}|0ranglelangle 0|$, then broadcast the result. If the outcomes are 0,0,0, then nothing happens. If the outcomes are 1,0,0 or 0,1,0 or 0,0,1, then the two parties measuring 0 are left with an EPR pair; hence this achieves random-party EPR distillation. If there are two or three $latex M_1$ outcomes, then the entanglement is lost. However, as $latex epsilonrightarrow 0$, then the probability of this happening goes to zero, while the number of rounds go up. Intriguingly, this is evidence that the set of LOCC operations is not closed (i.e. does not contain all of its limit points), but previously that was not proven. Of course, we can also generalize this to the N-partite setting.
This can be generalized by using local filtering to first (probabilistically) map a W-class state to one with $latex x_1=cdots=x_N$. In fact, the resulting probability of success is optimal (see paper), and thus this optimal probability of EPR distillation is an entanglement monotone.
The fact that they establish an entanglement monotone means they get an awesome corollary. There is a parameter $latex kappa$ which represents the success probability of distillation of an EPR pair involving alice. They give an explicit formula for it, and prove that it decreases for any measurement made by Alice that changes the state. Thus, they prove that:

LOCC is not closed!

Here is a great open question (not in the talk): Find a similar monotone that describes data hiding, for example to improve the analysis of these states.
For the multipartite setting, there are a few more ideas. There is a single-copy “combing transformation” (analogous to the one of Yang & Eisert), which transforms all the entanglement to bipartite entanglement shared with Alice. Again SEP is better than LOCC, in ways that can be quantified.
Some open problems:

  • What about the case of $latex x_0 not= 0$ W-states? They have some partial results, but it still remains open.
  • Can the gap between SEP and LOCC be increased arbitrarily?
  • Can one apply these ideas to other entanglement classes?
  • Do there exist similar phenomena in bipartite systems?
  • How much entanglement is required to implement SEP operations?

Rodrigo Gallego, Lars Erik Würflinger, Antonio Acín and Miguel Navascués:
Quantum correlations require multipartite information principles
(merged with)
An operational framework for nonlocality

Normally, we define “3-partite entanglement” to mean states that are not separable with respect to any bipartition; equivalently, they cannot be created by LOCC even if two parties get together. But this definition can be dangerous when we are considering nonlocality, as we will see below.
Nonlocality is a resource for device-independent information protocols. Define a local joint probability distribution to be one which satisfies $latex P(ab|xy) = sum_lambda p(lambda) P(a|x) P(b|y) $.
Nonlocality is often described in terms of boxes that mimic measurements on entangled states. But entanglement can also be manipulated, e.g. by LOCC. What is the analogue for boxes? Are their variants of entanglement swapping, distillation, dilution, etc? In most cases, when we make the boxes stronger, the dynamics get weaker, often becoming trivial.
They define a new class or operations called WPICC, which stands for something about rewirings and pre- and post-processing (note that non-local boxes can be rewired in ways that depend on their outputs, so their causal structure can be tricky). WPICC are the operations which map local joint probability distributions to local joint probability distributions, and shouldn’t be able to create nonlocal correlations; indeed, we can use them to define nonlocal correlations, just as entanglement is defined as the set of states that can’t be created by LOCC.
However, with these operations, operations on two parties can create tripartite nonlocality, so this approach to defining nonlocality doesn’t work.
Instead, define a box to be tripartite nonlocal if it doesn’t have a TOBL (time-ordered bilocal) structure, meaning a Markov-like condition that’s described in the paper.
Moral of the story? If you want a sensible definition of tripartite correlations, then beware of operational definitions analogous to LOCC, and focus on mathematical ones, analogous to SEP.

Martin Schwarz, Kristan Temme, Frank Verstraete, Toby Cubitt and David Perez-Garcia:
Preparing projected entangled pair states on a quantum computer

There are lots of families of states which seem useful for giving short classical descriptions of highly entangled quantum states. For example, matrix product states are provably useful for describing the ground states of gapped one-dimensional quantum systems. More generally there are other classes of tensor network states, but their utility is less well understood. A prominent family of states for trying to describe ground states of 2d quantum systems is projected entangled pair states (PEPS). These are just tensor network states with a 2d grid topology. The intuition is that the structure of correlations in the PEPS should mimic the correlation in the ground state of a gapped system in 2d, so they might be a good ansatz class for variational methods.
If you could prepare an arbitrary PEPS, what might you be able to do with that? Schuch et al. proved that an oracle that could prepare an arbitrary PEPS would allow you to solve a PP complete problem. We really can’t hope to do this efficiently, so it begs the question: what class of PEPS can we prepare in BQP? That’s the central question that this talk addresses.
We need a few technical notions from the theory of PEPS. If we satisfy a technical condition called injectivity, then we can define (via a natural construction) a Hamiltonian called the parent Hamiltonian for which the frustration-free ground state is uniquely given by the PEPS. This injectivity condition is actually generic, and many natural families of states satisfy it, so intuitively it seems to be a reasonable restriction. (It should be said, however, that there are also many natural states which don’t satisfy injectivity, for example GHZ states.)
The algorithm is to start from a collection of entangled pairs and gradually “grow” a PEPS by growing the parent Hamiltonian term-by-term. If we add a term to the Hamiltonian, then we can try to project back to the ground space. This will be probabilistic and will require some work to get right. Then we can add a new term, etc. The final state is guaranteed to be the PEPS by the uniqueness of the ground state of an injective parent Hamiltonian.
In order to get the projection onto the new ground state after adding a new term to the Hamiltonian, we can use phase estimation. If you get the right measurement outcome (you successfully project onto the zero-energy ground space), then great! Just keep going. But if not, then you can undo the measurement with the QMA amplification protocol of Marriott and Watrous.
The bound on the run time is governed by a polynomial in the condition number of the PEPS projectors and the spectral gap of the sequence of parent Hamiltonians, as well as the number of vertices and edges in the PEPS graph.
This can also be generalized to PEPS which are so-called G-injective. This condition allows the method to be generalized to PEPS which have topological order, where the PEPS parent Hamiltonian has a degenerate ground space.
Good question by Rolando Somma: What happens if the ground state is stoquastic? Can you get any improvements?

Esther Hänggi and Marco Tomamichel:
The Link between Uncertainty Relations and Non-Locality

The main result: Nonlocality, which means achieving Bell value $latex beta$,
implies an uncertainty relation, namely, the incompatible bases used in the Bell experiment have overlap at most $latex c^*=f(beta)$.
This is a sort of converse to a result of Oppenheim and Wehner.
This has some practical implications: the maximum basis overlap is important for things like QKD and the security of noisy storage, and this result implies that it can be tested robustly by observing a Bell inequality violation.
The kind of uncertainty relation we are interested in is of the form $latex H(X|B) + H(Y|C) geq -log_2(c)$, and indeed because this is ETHZ work, we should expect a smoothed min-entropy version as well: $latex H^epsilon_{max}(X|B) + H^epsilon_{min}(Y|C) geq -log_2(c)$.
Actually, we need a variant to handle the case that both bases contain a “failure” outcome. This work replaces the maximum overlap $latex c$ (at least in the case of the BB84 bases) with $latex c^* = frac{1+epsilon}{2}$, where $latex epsilon$ is the probability of the failure outcome. (See the paper for more general formulation.)
This parameter $latex c^*$ is related to the maximum CHSH-type value $latex beta$ via a nice simple formula. Unlike some previous work, it is somewhat more “device-independent”, not requiring assumptions such as knowing that the systems are qubits. $latex beta$ in turn, we can relate to an entropic uncertainty relation.

Salman Beigi and Amin Gohari:
Information Causality is a Special Point in the Dual of the Gray-Wyner Region

What is information causality? Essentially the idea that the bound on random access coding should apply to general physical theories in a nonlocal setting. More precisely, Alice has independent bits $latex a_1,ldots,a_N$, Bob has $latex b in [n]$, Alice sends a message $latex x$ to Bob which becomes part of Bob’s state $latex beta$, and Bob tries to guess $latex a_b$. The bound is $latex H(x) geq sum_{i=1}^N I(a_i; beta|b=i)$. This holds classically because Bob can guess $latex a_1$ without giving up his ability to guess bits $latex a_2,ldots,a_n$. So he can keep guessing all the others and his mutual information just adds up; but it cannot ultimately add up to more that $latex H(x)$.
If you look at the details, the key ingredients are the consistency of mutual information (that is, it accurately represents correlations), data processing and the chain rule. These all apply to quantum states as well (hence the quantum random access bound).
Overview of the talk

  1. connection to Gray-Wyner problem
  2. generalizing information causality
  3. connecting to communication complexity.

The Gray-Wyner region is defined as the set of rates $latex (R_0,ldots,R_n)$ such that there exists a random variable e with $latex R_0 ge I(a;e)$ and $latex R_i ge H(a_i|e)$ for $latex i=1,ldots,N$.
Thm: For any physical theory satisfying the above “key ingredients” plus AMI (accessibility of mutual information, to be defined later), the point
$latex (H(X), H(a_1|beta_1, b=1), ldots, (H(a_N|beta_N,b=N))$ belongs to the Gray-Wyner region.
AMI basically means that the mutual information satisfies something like HSW. There is an issue here, which the authors are looking into, with whether the Holevo information or the accessible information is the right quantity to use.
This theorem means that the information causality argument can be generalized: using the same assumptions, we also obtain all of the inequalities that are known to apply to the Gray-Wyner region. Hence the reference in the title to the dual of the Gray-Wyner region; the dual consists of all linear inequalities that are satisfied by the entire Gray-Wyner region.
This improves our understanding of these bounds. It also gives some new lower bounds on the communication cost of simulating nonlocal correlations.

Marcus P. Da Silva, Steven T. Flammia, Olivier Landon-Cardinal, Yi-Kai Liu and David Poulin:
Practical characterization of quantum devices without tomography

Current experimental efforts at creating highly entangled quantum states of many-body systems and implementing unitary quantum gates quickly run into serious problems when they try to scale up their efforts. The dimensionality of the Hilbert space is a serious curse if your goal is to characterize your device using full tomography. For example, the 8-qubit experiment that was done by Häffner et al. in 2005 took a tremendous number of measurements and many many hours of classical post-processing.
But what if you could avoid doing tomography and still get a good characterization of your device? Perhaps you are only interested in one number, say, the fidelity. Can you directly compute the fidelity without going through tomography?
Yes! And you can do it much more efficiently if you are trying to produce a pure quantum state or a unitary quantum gate.
We’ll focus on the case where the target state is a pure state for simplicity. But everything carries over directly to the case of unitary quantum gates with only minor modifications.
The fidelity between your target state $latex psi = |psiranglelanglepsi|$ and the actual (arbitrary) state $latex rho$ in the lab is given by $latex F= mathrm{Tr}(psi rho)$.
If you expand the expression for F in everybody’s favorite basis, the Pauli basis, then you get
$latex F = sum_i mathrm{Pr}(i) x_i$ where $latex mathrm{Pr(i)} = mathrm{Tr}(psi hatsigma_i)^2/2^n$ is a probability distribution which depends only on the target state and
$latex x_i = mathrm{Tr}(rho hatsigma_i)/mathrm{Tr}(psi hatsigma_i)$ is something which can be estimated in an experiment by just measuring $latex hatsigma_i$. (We could expand in any orthonormal operator basis, or even a tight frame; the Pauli basis is just to be concrete and experimentally accessible.) But this is just an expectation value of a random variable, $latex = mathbb{E}(x)$, and moreover the variance is small, $latex mathrm{Var}(x) le 1$. To estimate the fidelity, we just sample from the probability distribution and then estimate $latex x_i$ and then average over a bunch of iid samples. We only need $latex O(1/epsilon^2)$ different observables to get an estimate to within additive error $latex pm epsilon$.
There are a couple of caveats. Sampling from the relevance distribution can in general be a hard problem for your classical computer (though you only have to sample a constant number of times). And the number of repeated measurements that you need to estimate the expectation value might have to scale with the dimension of the Hilbert space.
But there is some good news too. We get a lower bound on tomography which is $latex tildeOmega(d^2)$ for the sample complexity using Pauli measurements for pure states, whereas the direct fidelity estimation protocol uses only $O(d)$ samples. Moreover, there are lots of classes of states which avoid the dimensional scaling, like stabilizer states, Clifford gates, or W-states. For these states and gates, the entire procedure is polynomial in the amount of time and the number measurements.

Robin Blume-Kohout:
Paranoid tomography: Confidence regions for quantum hardware
(merged with)
Matthias Christandl and Renato Renner:
Reliable Quantum State Tomography

We consider the setting of a source of quantum states and some measurements, followed by the reconstruction of the quantum state or process. How can we get useful and reliable error bars on the reconstruction?
One problem that you might have when doing a naive linear inversion of a state estimate is that you could have negative eigenvalues on the estimate, which is non-physical. How might you deal with these? One way to do that is to use maximum likelihood estimation (MLE). How might you get an error bar? You could use “bootstrapping”, which is to resample data from the initial estimate to try to estimate the variance. But this can be unreliable: you can even construct counterexamples where it fails miserably! Is there a better way? You could try Bayesian update, where you begin with a prior on state space and then computer a posterior and an error bar. Now everything that you report will depend on the prior, which might be undesirable in e.g. a cryptographic setting. So we need a reliable way to report error bars.
There are some existing methods for producing error bars (e.g. compressed sensing, MPS tomography, and others), but here we are interested in systematically finding the optimal confidence region for an estimate, one that is adapted to the data itself. It is beyond the scope of this work currently to consider also the notion of efficiency; this is an open question.
The main result: They derive region estimators such that the true state is contained in the region with high probability (over the data), and where the size of the region is, in a certain sense, minimal.
Question: how do these relate to the classical statistical notions of confidence region and Baysian credible interval? Tell us in the comments!
The authors use two different techniques to achieve these results. RBK’s results use a likelihood ratio function in the setting of iid measurements. The region is the set of states for which the likelihood function is larger than some threshold $latex epsilon$ which was chosen a priori. The proof involves a classical statistical analysis. The construction by MC and RR is defined as the region in state space which contains the MLE with high probability with respect to Hilbert-Schmidt measure, but enlarged by a tiny factor. The proof in this case uses a postselection technique for quantum channels.
Here Matthias wants to give a new proof that was custom-made for QIP! It uses likelihood ratio regions for general measurements in a new way that all three authors came up with together. We can credit QIP with the result because the program committee forced Matthias+Renato to merge their talk with Robin Blume-Kohout’s. But the two papers had different assumptions, different techniques and different results! After discussing how to combine their proofs for pedagogical reasons (there wasn’t time to present both, and it also wouldn’t be so illuminating), they realized that they could use Matthias + Renato’s assumptions (which are more general) and Robin’s method (which is simpler) to get a result that is (more or less) stronger than either previous result.

Sandu Popescu:
The smallest possible thermal machines and the foundations of thermodynamics

Are there quantum effects in biology?
Biological systems are open, driven systems far from equilibrium.
Sandu: “In fact, I hope it is a long time before I myself reach equilibrium.”
This talk is somewhat classical, but it does talk about the physics of information, at small scales.
He talks about very small refrigerators/heat engines. It turns out there is no (necessary) tradeoff between size and efficiency; one can get Carnot efficiency with a three-level (qutrit) refrigerator, which is simultaneously optimal for both size and efficiency.
It was a great talk, but your humble bloggers had mostly reached their ground state by this point in the conference… see the photo at the top of the post.

QIP 2012 Day 4

Possible location of QIP 2014??

Aleksandrs Belovs
Span Programs for Functions with Constant-Sized 1-certificates, based on 1105.4024

Consider the problem of finding whether or not a triangle exists in a graph with n vertices. Naively using Grover’s algorithm requires time $latex n^{3/2}$. But a series of improvements have brought this down to $latex n^{10/7}$ (using Grover with better combinatorics) and $latex n^{13/10}$ (using element distinctness in a clever way). The current paper reduces this run time to $latex n^{35/27}$. That is, the exponent is improved from 1.3 to 1.296296296…
Before you roll your eyes and move to another tab, hear me out. The real improvement is in the technique, which is radically new, and has applications to many problems; not only triangle-finding.
The idea is to construct something called a learning graph. Each vertex is a subset of [n], and (directed) edges are weighted, and correspond to (sets of) queries. These are like randomized decision trees except they can have loops.
The complexity of the resulting algorithm (which we’ll describe below) is a little tricky to define. There is a “negative complexity” which is simply $latex sum_e w_e$. And a positive complexity which involves maximizing over all yes inputs $latex x$, and then minimizing over all flows of size 1 where the source is the empty set and the sinks are the 1-certificates for $latex x$.
Algorithms in this model correspond to flows of intensity one. The vertex corresponding to the empty set is the root, and 1-certificates (i.e. proofs that x is indeed a yes instance) are the sinks. The cost is $latex sum_e p_e^2/w_e$, where $latex p$ denotes the flow.
From a learning graph, there is a recipe to produce a span program, and in turn a quantum query algorithm, with the query complexity equal to the geometric mean of the negative and positive complexities of the learning graph.
As an example, consider the OR function. Here the learning graph is comprised entirely of a root and n leaves. Weight each edge by 1, so the negative complexity is n. If there are k solutions, then the resulting flows are 1/k on each edge, and so the positive complexity is 1/k. The algorithm’s complexity is $latex sqrt{n/k}$. Phew!
Less trivial is element distinctness:
Are there two indices $latex ineq j$ such that $latex x_i=x_j$?
Again the algorithm achieves the optimal complexity, which is $latex O(n^{2/3})$, matching Ambainis’s algorithm.
There are lots of new applications too, to improving the triangle runtime, but also getting a bunch of polynomial speedups for other problems that either are, or resemble, subgraph containment. For example, it improves some of the results from 1011.1443, and gets linear time algorithms to check whether a graph contains subgraphs from certain families, like lines, T-shaped things, and lines that go through stars.

Francois le Gall,
Improved output sensitive quantum algorithms for Boolean matrix multiplication

Boolean matrix multiplication means that the matrices are binary, + is replaced by OR, and * is replaced by AND. This arises in many cases, often related to graph theory, such as computing the transitive closure of a graph, triangle-detection, all-pairs shortest path, etc. Also it has application to recognizing context-free languages.
The naive algorithm is $latex O(n^3)$. Reducing to integer multiplication gives $latex O(n^{2.3ldots})$. Quantumly, we can use Grover to compute each entry, for a total runtime of $latex O(n^{2.5})$.
But suppose C has only one nonzero entry. Then we can do recursive Grover search and get time complexity $latex O(n^{1.5})$ [Buhramn Spalek ‘05].
If it has $latex l$ nonzero entries, we get $latex O(n^{1.5}) sqrt l$, neglecting polylogs.
The parameter l represents the sparsity of matrix “output sensitive algorithms”.
In this work, the runtime is improved to $latex O(n^{3/2} + nl^{3/4})$. There is also a more complicated result for query complexity. The techniques are adapted from papers of Williams & Williams, and Lingas. They also make use of a wide range of polynomial speedup tricks: Grover, approximate counting, triangle-detection, etc. The talk leaves the exciting impression that we not yet finished seeing improvements to these techniques.

Richard Cleve Discrete simulations of continuous-time query algorithms that are efficient with respect to queries, gates and space, joint work with Dominic Berry and Sevag Gharibian.

After he was introduced, Richard sheepishly admitted, “The title isn’t really that efficient with its use of space.”
Normally the oracle for a quantum query algorithm defines the oracle query by flipping the phase of certain bits. In the continuous-time setting, we can get other phases. So, for example, a ”half-query” would give $latex Q |x rangle = i^{x_j} | x rangle$, since i is a square root of minus one. In this model, we only charge 1/2 of a query for this oracle call. More generally, we can have arbitrary fractional queries. Then we can take a limit and get to continuous time to find $latex expbigl(i(H_U + H_Q)tbigr)$ where $latex H_U$ is the driving (gate) Hamiltonian and $latex H_Q$ is a query Hamiltonian.
It is essential to be able to translate from the continuous-time model back to the discrete-time model for algorithmic applications. (cf. The work on NAND trees.) Is there a systematic way to do this translation? Yes, CGMSY-M `09 (improved by LMRSS `11) showed that the query complexity can be directly translated with the same number of queries. But these constructions are not gate efficient. Can we improve the gate complexity of the translation?
We have to be careful because the gate complexity will depend on the complexity of the driving Hamiltonian, so we have to carefully define a notion of approximately implementable Hamiltonian. If we have an epsilon-approximate implementation, and a driving Hamiltonian H, then the number of gates is at least $latex G ge log (|H|)/epsilon$, where the norm is the operator norm of the Hamiltonian (but modulo some details about the global phase.)
Main results: We can translate from the continuous-time setting to the discrete-time setting by using:

$latex O(T log T / log log T)$ queries
$latex O(T G log^2 TG)$ gates
$latex O(log^2 TG)$ qubits of ancilla space

Miklos Santha, Hidden Symmetry Subgroup Problems, joint work with Thomas Decker, Gábor Ivanyos and Pawel Wocjan

First let’s observe an interesting connection. In the dihedral group $latex D_{2p}$ for a prime p, the two-element subgroups generated by reflections coincide with the symmetries of the level sets of quadratic polynomials over $latex mathbb{F}_p$.
Can we more generally study subgroups which are hidden by symmetries? This generalizes and connects many problems in quantum algorithms, e.g. the hidden subgroup problem (HSP) over non-abelian groups, non-linear hidden structures and hidden polynomial problems among others, as the above example demonstrates.
Recall that the HSP is defined as follows. There is a function (for which you have an oracle) which is constant and distinct on the cosets of a subgroup of some group G. You want to obtain a generating set for the subgroup.
The hidden symmetry subgroup problem (HSSP): the input is a group G, a group action $latex circ$ which acts on a set M, a closed set of subgroups $latex mathcal{H}$, and there is an oracle function $latex f:M to S$, such that (we’re promised) for some subgroup $latex H in mathcal{H}$, $latex f(x) = f(y) leftrightarrow H circ x = H circ y$ for $latex x,y in M$. You want to output this (unique) H.

Rahul Jain and Ashwin Nayak:
A quantum information cost trade-off for the Augmented Index

Consider a twist (due to Yao ‘82) on the usual communication complexity problem. Alice has x, Bob has y, and they both want to compute f(x,y). But this time, they don’t want to leak (to the other party) more information than is already expressed by the answer, e.g. the millionaire problem.
Start with the usual example: equality testing. There is a O(log n) one-way protocol with 1/poly(n) error, and reveals only O(1) bits about one input.
Today’s talk is about:
Augmented index: Alice has $latex x_1, ldots, x_n$. Bob has $latex k, x_1,ldots, x_{k-1}, b$.
They want to know whether $latex b=x_k$.
(This has many connections to other works in theoretical computer science, not all of which are obviously communication-related. See paper for detail.)
Main result: For any quantum protocol which computes the augmented index with probability $latex 1-epsilon$ on the uniform distribution, we must have either: 1) Alice reveals $latex Omega(n/t)$ bits of info about x, or 2) Bob reveals $latex Omega(1/t)$ about k, where t is the number of messages. This is true even when restricted to 0-inputs.
There is also an earlier classical result by the same authors: Alice reveals $latex Omega(n)$ or Bob reveals $latex Omega(1)$. If there are only two messages, then the lower bound for Bob is improved to $latex Omega(log n)$ (which is optimal), even when restricted to 0-inputs.
At this point, Ashwin, apparently reading the audience’s mind, puts up a slide that is not quite as pertinent as Josh Cadney’s opening question, but is still what we are all thinking:
“Why Augmented Index? Why privacy w.r.t. 0-inputs?”
Answer: Suppose we care about a streaming algorithm, which for quantum is relevant because we might only have O(1) qubits for the foreseeable future. There are exponential space improvements in this model, but let’s just say they don’t look like the sort of problems that Cisco was dying to implement on their routers. Augmented Index, by contrast, is much more natural. One semi-related version is Dyck(2), which is determining whether a sequence of (,),[,] is properly bracketed (this is a special case of recognizing context-free languages). Here we observe a big separation between one and two passes over the input (if the second pass goes in reverse order), and the lower bound on the (classical) one-pass complexity is based on (classical) lower bounds for the augmented index problem.
Having motivated the question, Ashwin is left with “-2 minutes” to finish his talk. Talk about a cliffhanger! Now he’s guaranteed to have the proof accepted to QIP 2013 (although he did give away the punchline by stating the theorem at the start of his talk). Moreover, the chair apparently has a sign for -2 minutes left, since he didn’t say anything, he just held up a sign. I guess staying on time is not a strong point in our community. 🙂
Ashwin is now ruining his cliffhanger, giving an intuitive and cute overview of the proof! He claims that this part of the talk is “in reverse.”

Sevag GharibianHardness of approximation for quantum problems, joint work with Julia Kempe

Finding the ground state energy of a local Hamiltonian is QMA-complete, and this is harder than NP, because, um, there’s quantum in there too, and well, it seems harder? Sev wants to say something more concrete here. We’ve given up on solving local Hamiltonian exactly in general, because we’ve already given up on solving NP on a quantum computer (mostly). But surely there are good approximation algorithms for these things. Right? Well, today, we’ll talk about hardness of approximation, thus casting complexity-theoretic cold water on seemingly less ambitious goals.
All we know about approximating the local Hamiltonian problem:

  • PTAS on planar graphs
  • $latex 1/d^{k-1}$-approximation algorithm for dense graphs with k-local Hamiltonians.

This talk takes a step away from QMA, and considers instead the complexity class cq-$latex Sigma_2$, which is the set of problems that can be expresses as “there exists x, such that for all $latex |psirangle$, a poly-time verifier V accepts $latex (x, |psirangle)$.
Their main result is to show that the following problem is complete for cq-$latex Sigma_2$, and the hardness part holds even if we demand only approximate solutions.
The quantum succinct set cover problem: Given a set of 5-local Hamiltonians acting on n qubits, and a parameter $latex alpha>0$, and the promise that
$latex sum_{iin S} H_i succeq alpha I$,
find the smallest subset $latex S’subseteq S$ such that
$latex sum_{iin S’} H_i succeq alpha I$.
The diagonal case was shown to be hard to approximate by Chris Umans in 1999.
For the technical details, there are lots of nice tools, and you’ll just have to look at the paper (when it comes out…).

Jérémie Roland,Quantum rejection sampling, joint work with Maris Ozols and Martin Roetteler

Let’s first review classical rejection sampling, which is a tool that you can use to sample from arbitrary probability distributions. It has numerous applications: Metropolis sampling, Monte Carlo algorithms, optimization algorithms (e.g. simulated annealing) and many others. In the quantum case, we wish to replace probabilities by amplitudes.
Consider two probability distributions P and S. Suppose we can sample from P repeatedly, but we wish to sample from S. Define $latex Pr[{rm accept} k] = gamma s_k/p_k$ and continue to sample until you accept. Here we will choose the parameter $latex gamma$ in a way to try to minimize the expected amount of sampling from P that we need to do. The best way to do it is to choose $latex gamma = min_k p_k/s_k$. Then the expected number of samples from P to get a sample from S is $latex 1/gamma$.
Now for the quantum case. We are given access to a black box which prepares a state $latex |pi^xirangle = sum_k pi_k |xi_krangle |krangle$ where the $latex pi_k$ are known amplitudes. We want to prepare the state $latex |sigma^xirangle = sum_k sigma_k |xi_krangle |krangle$ in terms of some new amplitudes. What is the randomized query complexity of this?
Instead of computing a function, which is the normal thing that we use oracles for, we wish to generate a quantum state. There is an oracle, which is a unitary that hides the label $latex xi$ in a non-explicit way. The action of the oracle is $latex O_xi |0rangle = sum_k pi_k |xi_krangle |krangle$.
Let’s define quantum state preparations:
We have a set of quantum states $latex Psi = {psi_xi}$
set of oracles $latex O = { O_xi }$
Goal: generate $latex psi_xi$ given black box access to $latex O_xi$
What’s the minimum number of calls to the oracle to get the correct state (up to a coherent error with amplitude $latex sqrt{epsilon}$)?
We introduce a coherent coin.
$latex O_xi |0rangle to sum_k pi_k |xi_krangle |krangle (sqrt{ pi – alpha})|0rangle + alpha|1rangle)$
This rotated ancilla is like a classical coin, but now we have a superposition. If we measure the ancilla then we accept if we get one.
Using amplitude amplification, we can get a speedup over the naive accept probability.
We can also optimize over the rotation angle to get the optimum. This can be done efficiently since it’s an SDP. They can prove that this is an optimal algorithm. [photo 70]
Now for the applications. Linear systems of equations: the HHL algorithm basically just does phase estimation followed by quantum rejection sampling. [photo 71] The quantum Metropolis algorithm is a way to prepare a Gibbs state of H at inverse temperature beta.
Direct mapping of Metropolis MRRTT algorithm OK for diagonal Hamiltonians, but if Hamiltonian is not diagonal we want to avoid the work of diagonalizing it.
One of the reasons that the Q. Metropolis algorithm was difficult was that the “reject” moves required that you go back to a state that you just measured. You can avoid having to deal with this difficulty if you use quantum rejection sampling. One can amplify the accepted moves and therefore avoid having to revert moves at all!
Boolean hidden shift. f(x+s) = f(x) find s in Z_2^n. If the function is a delta function, then you can basically only do Grover. You can do bent functions (those with flat Fourier spectrum) with only 1 query [Roetteler 10]. These are the extreme cases, but what happens when we interpolate between them? We can use quantum rejection sampling to analyze this problem.
Other applications: Amplify QMA witnesses [MW05, NWZ09], prepare ground states of PEPS parent Hamiltonians on a quantum computer [STV11]. [photo 76]

Rolando Somma, Spectral Gap Amplification, joint work with Sergio Boixo

Preparation of eigenstates is a central problem and fast quantum methods for computing expectation values of observables leads to speedups for many problems. Let’s begin by considering adiabatic state transformations. The run time is governed by the spectral gap of the interpolating Hamiltonian. The generic cost C(T) of a quantum algorithm that prepares the eigenstate depends on the inverse power of the spectral gap.
given H with eigenstate psi and gap Delta, can we construct an H with same eigenstate psi but a larger gap? (without cheating by multiplying by some constant?) We want to consider that the Hamiltonian is a sum of terms which are, say, local. We have a black box which on input k, s evolves according to $latex e^{-i s H_k}$. How many times do we need to use this box? We require that the cost of evolving with H’ is the same.
They achieved a quadratic gap amplification for the following case: if H is frustration free, then the new gap is $latex Omega(sqrt{Delta/L})$ where L is the number of terms in the Hamiltonian. In particular, when the spectral gap is very small in terms of L this gives an improvement. It turns out this amplification is optimal.
Question: What prevents you from iterating amplification process?
Ans. The new Hamiltonian isn’t FF, so you can only do the amplification once.
S. Flammia: Optimality of results?
Ans. There may be better constructions with respect to the scaling in L.

Rajat Mittal, Quantum query complexity for state conversion, joint work with Troy Lee, Ben Reichardt, Robert Spalek and Mario Szegedy

We live in a quantum world, so why are we always trying to solve classical tasks, like computing f(x), for x specified by an oracle? How about trying to prepare $latex |psi(x)rangle$ given x specified by an oracle? Or even more generally, converting $latex |psi_1(x)rangle$ into $latex |psi_2(x)rangle$? To make things simple, let’s focus on creating $latex |psi(x)rangle$, although the results apply to more general state conversion problems as well. The difficulty of doing this is defined entirely by the Gram matrix $latex G_{x,y} = langle psi(x)|psi(y)rangle$.
Previous work on adversary bounds gave a tight characterization of quantum query complexity for boolean functions. The present work extends the result to general functions, and indeed to state conversion.
$latex gamma_2(G-J|Delta)$
Ok, what is this thing $latex gamma_2(M|Delta)$?
Without the $latex Delta$, we define $gamma_2(M)$ to be the minimum over factorizations $latex {u_x}$, $latex {v_y}$ (meaning that $latex M_{x,y} = langle u_x, v_yrangle$ of $latex max |u_x|,|v_y|$
w.r.t. $latex Delta$; it’s similar, but with $latex M_{x,y} = langle u_x, v_yrangle N_{x,y}$.
e.g. $latex gamma_2(M|J) = gamma_2(M)$ and $latex gamma_2(M|M)=1$.
This gives some nice corollaries, e.g. that function composition does what you’d expect to query complexity.

Fernando Brandao, Random quantum circuits are unitary polynomial designs, joint work with Aram Harrow and Michal Horodecki

Previously blogged here.
Questions:
Patrick Hayden: To fool a circuit takes more time than the circuit does. Can you introduce some kind of restriction to make it go in a more favorable direction?
Aram: We could try to fool quantum Turing machines, but I don’t think our techniques say much that’d be helpful there.
Renato: Could you replace a random circuit by a random Hamiltonian?
Ans. Interesting question. Hayden et al do something like that. More interesting is to…

BUSINESS MEETING

The headline stat: 42 papers were accepted out of 198 submitted, meaning a 21% acceptance rate. (Louis also said the “real” rate was 29%.)
There were 485 reviews, of which 104 are from 72 external reviewers.
There were 198 = 144 (early) + 54 (late) poster submissions.
Of the 42 talks, there were 4 plenary, 9 long, 27 regular and 2 merged.
The budget breakdown is: rental+technical 55k, food+drinks 90k, personnel 30k, student support 10k for a of total 185k.
There were 259 participants of which 113 students.
Thus conference fees were 105k and there was 85k from sponsors. This leaves a surplus of 5k, presumably to cover damages during the rump session.
For the locals, the conference booklet explains how to get to the rump session. You take the metro, and as Louis said, “you will visit a lot of stations.” Note that this post will go online after the rump session.
Also, where is QIP 2013???
BEIJING! (specifically, Tsinghua University)
Local organizers are: Andy Yao (general chair), Amy Wang (organizing chair), Giulio Chiribella, Luming Duan, Kihwan Kim, Jian Li, Yaoyun Shi, Shengyu Zhang.
There is a new CQI at Tsinghua. You can guess what it stands for. Note that few permutations of those three letters remain.
Tentative dates are: Jan 21-25 or Jan 28-Feb 1, 2013.
A truly open question is: where will QIP 2014 be?
Two tentative proposals on the table are IBM Yorktown Heights and Barcelona.
But volunteers are still welcome! If you want the experience of hosting QIP, send an email to Louis Salvail soon.
Possible locations include a science museum, a university campus and… La Pedrera? Seriously? (See picture at the top of the post.)
Then there was general discussion about QIP.
Suggestions include

  • not posting the long versions (at least without the authors confirming)
  • Renaming QIP Workshop -> QIP Conference.
  • Getting rid of the 3-page submissions? Here was the heated discussion. There always has to be a heated discussion about something.
  • Dec vs Jan? The audience seemed to overwhelmingly prefer January (among those who cared). And yet here they are, all here in December!

QIP 2012 Day 3

PSPACE is a black hole gobbling up the other complexity classes.
And you thought the LHC was dangerous!

Troy Lee and Jérémie Roland:
A strong direct product theorem for quantum query complexity

Suppose you want to solve k independent instances of a problem with k-fold as many resources. For example, say you want to juggle 4 balls with two hands vs 2 balls with one hand.
A Strong Direct Product Theorem (SDPT) says (roughly) the following:
If we require r resources to solve one instance of a problem with success probability p, then using fewer than kr resources to solve k instances cannot succeed with probability greater than pk.
Intuitively, it ought to hold in many situations, but has been proven to hold only in a few. Some applications of SDPT:

  • Making a hard function really hard, e.g. Yao’s XOR lemma
  • Amplify soundness of a protocol, e.g. Raz

Here they study Quantum Query Complexity, which uses a black box oracle that can be queried in superposition, as a proxy for quantum time complexity. It behaves so well that it scarcely deserves to be called a computational model.
The big picture is that they use adversary methods and the state generation formalism due to Ambainis. The method defines a type of potential function and then bounds the rate of change from one query to the next. In the additive setting, they bound the difference, and in the multiplicative setting, they bound the ratio. The state generation method has the advantage that it allows the separation of lower bounds into two parts: the bound on the zero-error state generation and the minimization of this bound over the valid final Gram matrices of a successful protocol.
It is known that the multiplicative adversary method characterizes bounded-error quantum query complexity since the multiplicative adversary is at least as large as the additive one. One of the achievements of this paper is that they show that the multiplicative adversary method still works when the convergence ratio is strictly bounded away from one. This immediately implies a SDPT by the results of Spalek `08, but they reprove the SDPT from scratch so that they can get a better optimization of the parameters.
Conclusion and open problems: For Boolean functions, they show an XOR lemma. Does SDPT hold for all state generation problems? Is the multiplicative adversary method also tight in the case of large error?

André Chailloux and Or Sattath:
The Complexity of the Separable Hamiltonian Problem

First recall that QMA is the quantum analog of NP: it’s the set of problems which can be verified efficiently on a quantum computer when given a quantum proof (i.e. a quantum state for a witness). For QMA wth 2 provers, denoted QMA(2), there are two unentangled states which are given as proofs. It’s similar to interrogation of suspects: Author can play the two Merlins against each other and it is probably a much stronger class. (Each Merlin still sends poly(n) qubits, so QMA(2) is at least as strong as QMA, but is probably stronger.) Pure n-representability is known to be in QMA(2) but not known to be in QMA. It’s known that QMA(k) = QMA(2) for all k>2. We know only that QMA(2) $latex subseteq$ NEXP; this seems like a pretty weak upper bound!
The main result of this paper is to find a complete problem for QMA(2). The separable local Hamiltonian problem is actually in QMA, but the separable sparse Hamiltonian problem is QMA(2) complete.
The problem: given a Hamiltonian $latex H = sum_i h_i$ where each term is either local or separable, then decide whether there exists a separable state with energy at most a or if there is no state with energy less than b. (Where b-a > 1/poly(n) is the promise gap.)
Separable Local Hamiltonian: Given a local Hamiltonian H find a separable (say across an n by n qubit cut) $latex rho$ that minimizes $latex mathrm{tr} H rho$.
This sure doesn’t look like it’s in QMA!
If the prover sends $latex rho$, there’s no way to verify that it is a product, because you can’t test entanglement with a single copy.
Instead, the prover sends $latex rho$ AND classical descriptions of all k-qubit marginals (k comes from H being a k-local Hamiltonian). The consistency of these marginals is in QMA; the witness is simply $latex rho$. (In fact, this problem is QMA-complete.)
Separable Sparse Hamiltonian: A totally different story! Now it’s QMA(2) complete. (The first known such problem.) Immediately we see that the qubit structure doesn’t exist, so we can’t rely on the strategy of looking at marginals. Clearly the problem is in QMA(2). To show completeness, we adapt Kitaev’s Hamiltonian for encoding a circuit (his so-called clock construction.) One challenge is that if the QMA(2) verifier makes an entangled measurement, then the history state will be entangled, which is not of the form we want. We want to simulate a computation in which the input is entangled, but we don’t care about the later steps; however, we are given the ability to minimize the energy of the sparse Hamiltonian for separable states, period.
The trick is that 1001.0017 shows that QMA(2) = QMA(2)-SEP, which is defined to be the class where the verifier is restricted to making separable (i.e. non-entangling measurements, at least for the yes outcome). This measurement requires a controlled-SWAP on the two proofs, which interestingly, is sparse, but not local. You might think you can build this out of a bunch of small controlled-SWAPs, but this would produce entanglement at intermediate steps. SWAP has a complicated relationship with entanglement. It doesn’t entangle product states, but only if it’s applied to the entire state; applying it to subsystems can create entanglement.
Many of the same open questions remain, about the status of QMA vs QMA(2) (although there is evidence from the case of log-size proofs to suggest that they are different), and about whether pure N-representability is also QMA(2) complete.

Yaoyun Shi and Xiaodi Wu:
Epsilon-net method for optimizations over separable states

The QMA(2) session continues! Once upon a time, I (Aram) thought QMA(2) was the kind of thing that complexity theorists make up to have more things to talk about, but I’ve since made a total U-turn. The problem, especially it’s log-size-witness variant, is equivalent to a ridiculously large number of problems, like separability testing, mean-field Hamiltonians, computing minimum output Rényi entropies of channels, finding the largest singular value of tensors, estimating the $latex 2rightarrow 4$ norm of matrices, and many others. So it’s exciting that the only dimension-independent hardness result we know requires quantum techniques (1001.0017). And also the only (classical) approximation algorithm! (1011.2751)
This talk changes this, with a pair of elementary approximation algorithms, one of which matches the performance of a weaker variant of 1010.1750 (details below).
To understand the basic problem, we don’t need to think about provers or circuits. The problem is as follows: Given a Hermitian matrix H on $latex d^2 times d^2$ dimensions, estimate $latex {rm OptSep}(H) := max { {rm tr} H rho : rho in {rm Sep}(d,d)}$, where $latex {rm Sep}(d,d)$ denotes separable states with d dimensions for each party. This is a convex optimization problem, but it’s hard to test membership in $latex {rm Sep}(d,d)$; indeed, this has equivalent difficulty. Alternatively, we know that the maximum is achieved for a $latex rho$ which is a pure product state, where membership is easy to test, but this set is non-convex.
It is known that performing this optimization up to 1/poly(d) accuracy is NP-hard. (This result is originally due to Gurvits, although that’s not so widely known, since most people only noticed the part of his paper which talks about the hardness of testing membership in $latex {rm Sep}(d,d)$.) Constant accuracy is only known to be as hard as solving a 3-SAT instance of length $latex log^2(n)$.
This talk gives two algorithms for approximating OptSep.
One works well when H is nearly product; in particular, when H can be decomposed as $latex H=sum_{i=1}^m A_i otimes B_i$. The algorithm is to compute the joint numerical range of the $latex A_1,ldots,A_m$ and $latex B_1,ldots,B_m$, meaning the sets of possible $latex ({rm tr}rho A_1, ldots, {rm tr}rho A_m)$ and of possible $latex ({rm tr}rho B_1, ldots, {rm tr}rho B_m)$. If m is not too big, then this is reasonable. Moreover, we can enumerate over the corresponding epsilon-nets with only a small amount of space. This implies that when QMA(2) is restricted to nearly product measurements (i.e. $latex m leq mathrm{poly}(n)$), then the resulting class is contained in PSPACE.
The second algorithm does eigenspace enumeration directly on H, and achieves additive error $latex delta$ in time $latex {rm poly}(d) exp(|H|_2^2delta^{-2} log(|H|_2/delta))$. This matches one of the corollaries of 1011.2751, but not their main result.

Abel Molina and John Watrous:
Hedging bets with correlated quantum strategies

Alice and Bob are at it again, playing a nonlocal game. Here is the framework: Alice prepares a question and sends it to Bob, Bob responds by sending an answer to Alice. Based on Bob’s answer, as well as whatever memory she has of her own question, Alice decides whether Bob has passed or failed a test.
Alice’s actions are specified by a density operator and a two outcome POVM. We assume that Bob knows a complete description of Alice. Then the games that they consider here can be formulated as a semidefinite program to determine Bob’s optimal success probability for passing the test.
Let’s consider parallel repetition: Alice performs two tests independently, but Bob can correlate his strategy in an arbitrary way. Suppose that Bob’s optimal probability to pass a single instantiation of a given test is p, and Alice independently runs the test twice. There are several natural questions: What is the optimal probability for Bob to pass both or at least one test? To pass k out of n independent tests? If Bob uses an independent strategy, you can just calculate the success probability of that. Can he do better by correlating his strategies?
If Alice’s test is classical, then Bob gains no advantage. In the quantum setting, Bob’s optimal probability to pass both tests is again p2, so he gains no advantage by correlating the tests. Both facts can be proven using SDP duality.
It would be natural to conjecture that it is also optimal for Bob to play independently if he only cares about passing at least one test. But this turns out to be false. They have a simple counterexample to this natural conjecture, and they compute Bob’s optimal strategy for a simple nonlocal game. The strategy is a form of hedging.
This work is in a different setting than other results with non-classical correlations of measurement outcomes. It might give insight into possible attacks in cryptography, where breaking the system only once is all you need: you don’t try independent attacks, instead you entangle a joint attack! It also gives reduced errors in quantum interactive proof protocols. So far, they have proved some partial results along these lines.
Comment by Richard Cleve: There is a classical strategy for hedging in the CHSH game which can perfectly win 1 out of 2 parallel games.

Jop Briet and Thomas Vidick:
Explicit lower and upper bounds on the entangled value of multiplayer XOR games

Bell’s theorem says that the set of correlations $latex P[ab|xy]$ achievable quantumly is bigger than the set achievable classically. For this talk let’s focus only on $latex P[aoplus b | xy]$. Let’s look quantitatively at how the quantum advantage depends on parameters like the number of questions, the number of answers, the number of players, and the amount of entanglement (could also consider the type). Boosting this advantage is experimentally relevant to prove wrong all those quantum-mechanics deniers out there. You know the type.
For bipartite entanglement, we understand the situation pretty well now, but what about multipartite? (Another motivation is that quantifying multipartite entanglement is tough because it’s not clear what the goals are: this gives a nice clear way of keeping score.)
For a desired bias ratio R (meaning the entangled value is R times the unentangled value).

upper bound ($latex exists$ game)

lower bound (Best possible)

min questions

$latex O(R^4)$

$latex Omega(R^2)$.

min local dimensions

$latex O(R^2)$

$latex Omega(R^2)$

Here’s the outline of the proof.

  1. For any matrix (of appropriate size) we construct a 3-player XOR game.
  2. Relate R to spectral properties of matrix.
  3. Existence of a good matrix that gives a large separation

And the details.
1. Given R, choose n such that $latex 2^{n/2}=R$.
Give each player n qubits.
The game is to choose Paulis P,Q,R with probability $latex |{rm tr}M(Potimes Qotimes R)|^2$ (suitably normalized). And ask the players to output bits whose XOR gives the sign of $latex {rm tr}M(Potimes Qotimes R)$.
2. The entangled bias is at least the largest eigenvalue of $latex M$.
The classical bias is the 2-2-2 norm, = $latex max langle M, A otimes B otimes Crangle$ where the max is taken over A,B,C with Frobenius norm $latex leq 2^{n/4}$.
3. Choose $latex M$ randomly of course! Might as well make it a rank-1 projector, so the largest eigenvalue is 1, but other norms, like the 2-2-2 norm, are likely to be small.
Some open questions: They leave open a gap of size R2 in the number of questions needed for an R-sized bias ratio. It would be nice to get a tight analysis. Another open question is to derandomize their construction.

Gus Gutoski and Xiaodi Wu:
Parallel approximation of min-max problems
with applications to classical and quantum zero-sum games

Everyone knows that QIP=PSPACE. But did you also know that SQG=QRG(2)=PSPACE? (Short quantum games, and two-turn quantum refereed games, of course.) All of these follow from using the multiplicative weights update method to solve large SDPs in parallel. (Here is a review of the multiplicative weights update method that doesn’t discuss quantum applications, and here is a review that does).
This work concerns 2-turn quantum refereed games. Some motivation comes from complexity theory. We can consider interactive proof systems with two competing provers, one trying to prove Yes and the other trying to prove No.
The key result is that SQG is contained in PSPACE thus unifying and simplifying previous results about various games and interactive proofs being equivalent to PSPACE.
Additionally, they showed that public coin RG is not equal to RG unless PSPACE = EXP. Thus, the case of two competing provers is different from the case of a single untrusted prover (the usual IP situation) in which the messages to the prover can be replaced by public coins.
The key technical result is a way to parallelize an SDP related to whether certain quantum states exist. SDPs in general cannot be parallelized unless NC=P, but this result is possible because it deals with normalized quantum states going through trace-preserving quantum channels, and so the multiplicative weights update method (MWUM) can give a good approximation. Indeed MWUM has previously been used to quickly find solutions to classical zero-sum games (and convex programs in general), so it is natural to examine it for quantum zero-sum games. And even though there may be many rounds of interactions, these can be modeled by a series of linear and semidefinite constraints, applied to the density matrices appearing at various intermediate steps. This “transcript representation” was originally invented by Kitaev (whose ghostly presence looms large over this entire conference) in his work on coin flipping.
They also discussed penalization and rounding theorems and space efficiency.
The best part was a nice artistic depiction of PSPACE as a black hole gobbling up the other complexity classes, as shown at the top of the page!
 

QIP 2012 Day 2

Quantum Information Enters the Third Dimension

Itai Arad, Zeph Landau and Umesh Vazirani:
An improved area law for 1D frustration-free systems.

Some motivation: which quantum many-body systems can be simulated efficiently on a classical computer? Which states even have an efficient classical description (even if we might have trouble finding that description)? Can we efficiently describe their entanglement? These questions are important both theoretically and practically.

In this talk, we’ll address this question specifically focused on the case of ground states of gapped local Hamiltonians.
We care about this special case for reasons related to both physics and computer science. Local Hamiltonians are ubiquitous in condensed matter physics, and are the natural quantum version of constraint satisfaction problems in complexity theory.
For a state to be “efficient” in a meaningful sense, we should not just have a poly-sized classical description of the state. (For example, the local Hamiltonian itself implicitly gives such a description!) We should be able to efficiently calculate the expectation values of local observables.
For an arbitrary pure state, the Schmidt rank ($latex mathrm{SR}$) or the von Neumann entropy ($latex S$) can be as large as the total number of spins in one half of a bipartition. In condensed matter systems we expect that these measures of entanglement don’t have a volume scaling, but rather, a scaling proportional to the number of spins on the boundary of the region defining the bipartition. This is what is meant by an area law.
Why might you expect that an area law holds? One motivation is the exponential clustering theorem (Hastings & Koma `06, Nachtergaele & Sims `06) which says that gapped systems have connected correlation functions which decay exponentially. Namely, if $latex |Omegarangle$ is a ground state of a gapped system, then for local observables with disjoint support we have

$latex | langle ABrangle -langle Arangle langle B rangle | le c(A,B) expbigl(-mathrm{dist}(A,B), O(Delta E) bigr)$

Here the constants depend on the strength of interactions and the norms of the local observables.
But correlation is not the same as entanglement! For example, there
are data hiding states (Hayden et al. `04, Hastings `07). Bouding
the entanglement length is much harder than simply bounding
correlation because, in a sense, entanglement can “sum up” a bunch of weak correlations and possibly add up to something large.
The area law has only been proven rigorously for 1d systems (Hastings `07) and it is open in other settings. For a 1D system, the area law means that the entropy of any contiguous block is bounded from above by a constant. Some implications: the ground state has an efficient description in terms of an MPS and it explains why DMRG works so well in practice.
What about dimensions higher than one? The constants in Hastings’ proof are exponential in $latex X = log(d)/Delta E$. (Here d is local site dimension.) The best lower bounds (Hastings & Gottesman `09, Irani `09) are only polynomial: $latex S ge Delta E^{-1/4}$. Improving these constants in the 1d proof might be able to give nontrivial bounds in 2d by course graining one of the dimensions, so progress on these constants could have highly nontrivial implications.
The new result, an upper bound which has only polynomial dependence on $latex X$. The new bound is, up to polylog factors, $latex S le X^3$.
The proof techniques use Chebyshev polynomials an they might be useful elsewhere. They assume the Hamiltonian is frustration free, but this probably can be extended to the more general case of frustrated gapped local Hamiltonians.
Outline of the proof
The general strategy: start with a product state and approach the ground state without generating too much entanglement. For example, let’s assume that the projection to the ground state doesn’t create too much entanglement, as quantified by the Schmidt rank: $latex mathrm{SR}(Pi |phirangle) le D, mathrm{SR}(|phirangle)$ for some $latex D$. Then we could get a bound of $latex S le log D$ in terms of the von Neumann entropy.
We will have to use an approximate ground state projector (AGSP). It needs to have three properties (here $latex |Omegarangle$ is the unique ground state):

  1. $latex K | Omega rangle = |Omegarangle$
  2. $latex | K |Omega^perprangle|^2 le Delta | |Omega^perprangle|^2$
  3. $latex mathrm{SR}(K|phirangle) le D, mathrm{SR}(|phirangle)$

Three factors determine the bound on $latex S$: $latex D$, which measures how much entanglement we create; the shrinking factor $latex Delta$, which determines how fast we approach the ground state with each successive application of $latex K$; and $latex mu$, the overlap between the initial product state and the ground state.
Lemma: $latex S(Omega) le O(1) dfrac{log mu}{log Delta} log D .$
When $latex D Delta < frac12$, then there is a product state with big overlap with the ground state ($latex mu$ isn’t too small). This is the bootstrap lemma. As a corollary, then for any AGSP we have an area law.
Now to prove the area law, we need to actually find a good AGSP. If it is frustration free, we can consider local projectors w.l.o.g. Then we can use the detectability lemma to get the AGSP.
The idea of the detectability lemma is to split the projectors into two layers, the product projector onto the even and odd terms in the Hamiltonian, respectively. Let $latex A = Pi_ePi_o$. Then we have

$latex |A |Omega^perprangle|^2 le (1+Delta E/2)^{-2/3} | |Omega^perprangle|^2.$

Therefore, A is an AGSP and the Schmidt rank increases by at most $latex D = d^2$ and shrinkage factor satisfies $latex Delta = (1+Delta E/2)^{-2/3}$. But we need a better AGSP to get an improvement to the area law.
Main idea: “dilute” A a bit so that we don’t let the entanglement accumulate too much. Can we dilute A so that the Schmidt rank decreases, but without worsening the shrinking factor by too much?
We define a new operator $latex mathbb{N} = sum_i Q_i$. It counts the number of constraint violations in a particular layer. It is a polynomial of degree m (the number of terms in the layer). Can we approximate it with a lower degree polynomial? Yes, and this is where the Chebyshev polynomials enter. Taking the Chebyshev polynomial of degree $latex sqrt{m}$ gives a good approximation, but with a much lower degree. But now the number of terms is large.
We have to bound the number of terms in the sum. The bound is $latex t = expbigl(O(log^2(q sqrt{m}) ellbigr)$. The overall Schmidt rank is at most $latex D^ell$.
Now we have to glue everything together. We still have to initially course grain the system a bit to deal with the case when $latex Delta D > 1$. (Recall, we need this product to be less than one half in order to get the first lemma to work).
Open questions: efficient representation for ground states of gapped local Hamiltonians in higher dimensions? Area law for higher dimensions? Can the proof be generalized to the frustrated case? Can this idea of “dilution” be used elsewhere?

Spyridon Michalakis and Justyna Pytel:
Stability of Frustration-Free Hamiltonians

I already blogged about this work (twice!) so I’ll just refer to that.

Toby Cubitt, Martin Schwarz, Frank Verstraete, Or Sattath and Itai Arad:
Three Proofs of a Constructive Commuting Quantum Lovasz Local Lemma

The Lovász Local Lemma (LLL) says that if several events are “mostly independent”, then the probability that none of these events occurs is still strictly positive. The constructive version of the LLL, whose most recent instantiation is due to Moser and Moser & Tardos, shows that the simplest algorithm you can imagine converges to a solution: just start with a random assignment and then check random constraints which are violated, fix them with a random assignment, and repeat until you reach a satisfying assignment. The quantum version of the LLL replaces classical “events” with projectors, but the current version is nonconstructive. Can we find a constructive proof in the quantum case?
There are two challenges: entanglement and the measurement-disturbance tradeoff. This talk will focus on the challenge of entanglement, since this challenge is present even in the presence of commuting Hamiltonians.
In the simplest generalization of Moser & Tardos (probabilistic version) to the quantum case, we just replace local qubits with the maximally mixed state. This is the Verstraete et al. `09 dissipative map, so we know it converges to the correct state, but we can’t (yet) bound the convergence rate.
Key lemma: We can bound the probability of generating a particular sampling sequence, i.e. a particular directed acyclic graph (DAG). Next, we can bound the expected number of violations, then use this as a subroutine in an efficient algorithm. This allows us to bound the runtime.
Second approach: use the original entropy compression argument of Moser. This goes through without too much effort. The algorithm halts in expected linear time.
What about the noncommutative case? Now there are two complementary problems: in the first proof, we can no longer bound the runtime. In the info-theoretic second proof, there is no guarantee that it halts on the correct state (though it definitely will halt). There are two conjectures which, if true, would enable a noncommutative constructive QLLL. See the paper for details.
Toby, you went way too fast for me to keep up, sorry.

Norbert Schuch:
Complexity of commuting Hamiltonians on a square lattice of qubits

I also blogged about this one previously.

Josh Cadney, Noah Linden and Andreas Winter (contributed talk):
Infinitely many constrained inequalities for the von Neumann entropy

The speaker addresses a central question which affects all of us: Why is this talk better than lunch? Answer: it’s different; it’s introductory.
The von Neumann entropy satisfies many inequalities, for example, positivity and the strong subadditivity inequality. In this talk, we are interested only in inequalities based on reduced states that are dimension independent.
Define the entropy vector: for any subset of systems $latex alpha$, we define reduced state on $latex alpha$, and the associated entropy. Stack all of these entropies into one big vector. If we fix the number of parties to be n, then we let $latex Sigma_n$ be the set of all entropy vectors of n-partite states. We don’t make restrictions on the local Hilbert space dimension. We want to characterize this entropy vector (as a function of n). Since the von Neumann entropy tells us about bipartite entanglement, than this might tell us something about multipartite entanglement.
If we look at all $latex Sigma_n$, then we get a convex cone once we take a closure. Call this the entropy cone. What sort of inequalities do we know which characterize this entropy cone?

  • For one party, we know positivity.
  • For two parties, we know subadditivity and triangle inequality.
  • For three parties, we know strong subadditivity and weak monotonicity.

Do these existing inequalities tell us everything we need to know about the entropy cone? To check this, we look at the extremal rays and try to find an entropy vector there. For n=2, we find that there are no more inequalities: the above list is exhaustive. For three parties, we find four families of rays, and again there are no more inequalities.
For four parties, we can continue this program. The extremal rays are stabilizer states! Are there any more four party inequalities? We don’t yet know. To help address the question, we first look at the classical case (Shannon entropy). In the classical case we additionally have (non-weak) monotonicity. Playing the same game in the classical system, we learn from previous work that the four-party cone isn’t polyhedral: there are an infinite number of entropy inequalities.

New result: given a constraint on the mutual information $latex I(A:C|B) = I(B:C|A) = 0$, then there are new inequalities, and additional inequalities each time we add a new party. See the paper for details. The proof elaborates on a classical proof by Makarychev et al. from `02. Each of the inequalities is independent: every new inequality sharpens the bound on the entropy cone in a different way than the others. They have not been able to violate a single one of the new classical inequalities numerically.

Jeongwan Haah:
Local stabilizer codes in three dimensions without string logical operators

The general problem is to find a noise-free subspace/subsystem of a physical system on which manipulations are still possible. For a many-body system, local indistinguishability (aka topological order) is stable at zero temperature, but might be fragile at nonzero temperature. Here we are focusing on the case of memory only (so, only in the ground state).

First a review of the toric code. Because the logical operators are string-like, the toric code is not stable at nonzero temperature. This talk will try to build a code which doesn’t have this drawback.

The code is defined on a cubic lattice with 2 qubits on each vertex. There are no string-like logical operators in this model. This system has an exotic ground state degeneracy which depends on the system size.

Some general considerations:

  • Simple cubic lattice with m qubits per site
  • t types of cubic interactions with code space dimension > 1
  • No single-site logical operators
  • No obvious string operator

To simplify things:

  • t=2: one for X and one for Z type Paulis
  • Product of all terms in the Hamiltonian should be Id
  • Logical operator on a site should be Id
  • Logical operator on a straight line should be Id

These conditions can be translated to linear constraint equations on the symplectic vector space of the Abelianized Pauli group. There are 64 commutation relations on a single unit cell of a cubic lattice which determine everything and there are 27 independent linear constraints (from the above considerations). There are 37 free variables, and 30 remaining equations. If you brute force solve everything, then you find there are 17 solutions.
The first code has a Z<–>X symmetry and looks like this:
Some questions: Is this model topologically ordered? What do you mean by a “string”?
To answer the first condition, use the local topological order condition from Bravyi, Hastings, Michalakis. That is, observables with local support should not be distinguishable in the ground state. One can check this condition and it is satisfied. The computation simplifies if you take advantage of the symmetries.
Let’s get a rigorous definition of “string”. This is a key contribution of this work. A string is a finite Pauli operator which creates excitations at at most two locations. Define an “anchor” to be an envelope around the excitations. The “width” is the size of the anchors, the “length” is the distance between the anchors. There are no geometric restrictions. We need to tweak this definition a bit: we need a notion of a “trivial” string segment to rule out some annoying counterexamples.
The no-strings rule: String segments of width w and length L > c w are always trivial (where c is some constant). The above code satisfies the no-strings rule with c=15.
One can prove this using the same techniques developed above to check the local TQO condition. Jeongwan calls this the eraser technique.
To understand what the no-strings rule means, you have to develop some algebraic tools. Physically, it means you can’t drag defects; you can only annihilate them and then recreate them elsewhere. Jeongwan spent some effort describing these tools, but it was way too much for me to write all of it down (well, at least if I wanted to understand any of it). You’ll have to look at his papers for details.
Jeongwan specifically showed how both the local TQO and no-strings rule could be cast in the language of algebras of polynomials to give simple proofs for the case of his codes.

Andrew Landahl, Jonas Anderson and Patrick Rice:
Fault-tolerant quantum computing with color codes

Haah’s nice code makes me want to think more about 3d
But my 3d glasses here (puts on 3d color glasses) can make 2d color codes look 3 dimensional (see top of post.)
Consider color codes. One of the codes (the square-octagon lattice code, aka the 4,8,8 lattice) has a transversal S gate.
The control model: there is a faulty gate basis consisting of X, Z, H, S, $latex S^dagger$, CNOT, measurements on X and Z, and state preparations on 0, + and $latex pi/4$. Additional assumptions: parallel ops, refreshable ancillas, fast classical computation, equal-time gates.
Locality assumptions: 2D layout, local quantum processing.
We also consider two error channels: bit flips and double Pauli (DP).
Consider three models of error:
Circuit based noise model:

  • each preparation and one-qubit gate followed by BP(p)
  • Each CNOT gate followed by DP(p)
  • Each measurement preceded by BP(p) and result flipped with prob p

Phenomenological noise model:

  • Same, except each syndrome bit extraction circuit modeled as a measurement that fails with prob p. This ignores noise propagation between data and ancilla.

One other noise model is the code capacity noise model. See paper for details. 🙂
Decoders and thresholds: Can use an MLE decoder or an optimal decoder (which might be slow). The threshold seems to be around 10.9% for the optimal decoder, and not much worse for the MLE decoder: Some numerical results (depending on model) are 10.56%, 3.05%, $latex 8times10^{-4}$%

(An aside.) Random-bond Ising model: Consider an Ising model with random signs on the two-body couplings. Every CSS code has an associated random-bond Ising model. The Nishimori conjecture (which is false!) said that you can’t increase the order by increasing the temperature.
Details on syndrome extraction: One can use sequential schedules or interleaved schedules. We need to make sure that the code checks propagate to themselves, and syndrome checks propagate to themselves multiplied by a stabilizer element. They found an interleaving schedule that works.
Decoding: Use the code-capacity MLE decoder (works for all CSS codes). They formulate it as an integer program over GF(2), which can be rewritten as an integer program over the reals. (This isn’t efficient (NP-complete in general).)
Showed some pictures of the thresholds.
Turns out the interleaved schedule doesn’t have much impact… it’s about the same threshold (within error) as the sequential schedule.
Also showed some rigorous lower bounds which come from a self-avoiding walk.
Architectures and computation: can imagine a pancake architecture which does gate transversally between layers. How does this change the threshold?
Good point: Injection of magic states into codes hasn’t really been studied in the case where the injection is error-prone. This would be a good topic for future work.

Guillaume Duclos-Cianci, Héctor Bombin and David Poulin:
Equivalence of Topological Codes and Fast Decoding Algorithms

Note that I blogged about this previously.
Two gapped systems are in the same phase if there are connected by a local unitary transformation, meaning there is constant-depth quantum circuit (CDQC) which transforms one into the other.[Wen et al., 2010, Schuch et al. 2010] What makes two codes equivalent? Under what conditions does a CDQC exist?
Consider two copies of the Kitaev toric code (KTC). Does this create a new code? No… However, you could imagine applying a CDQC to the pair of codes which masks this product code. Thinking in reverse, you could ask which codes have such a character, namely, that they are loosely coupled versions of multiple copies of the KTC.
The main result is (roughly) that every 2D stabilizer code on qubits is equivalent (modulo CDQC) to several copies of the KTC, and moreover the mapping can be constructed explicitly.
Some useful concepts. Topological charges are equivalence classes of excitations which can be fused within a fixed bounded region back into the vacuum configuration. Excitation configurations are collections of defects inside a region. String operators move defects around.
The result applies in the following setting. They consider 2D lattices of qubits with geometrically local, translationally invariant stabilizer generators. To have a topological-like condition, they ask that the distance of the code scales with the linear size of the lattice ($latex d sim L$).
Gave a sketch of the proof… too detailed to blog in real time. See the paper for details.
Application to quantum error correction (QEC): The decoding problem is the same as trying to infer the worldline homology class from the measured particle configuration. You can decode, say, the 4,8,8 color code by deducing a mapping of the charges, then deducing an effective noise model on the KTCs, and then applying your favorite decoding algorithm for each KTC.
They did some numerics and found 8.7% fault-tolerance threshold for the 4,8,8 code using the RG decoder.

Joseph M. Renes, Frederic Dupuis and Renato Renner (contributed talk):
Quantum Polar Coding

Joe started off by disappointing the audience with an announcement that his all-ETHZ team had written a paper without any smooth entropies.
Instead, he talked about quantum polar codes, which are explicit, channel-adapted block quotes (CSS codes, in fact), with linear-time encoding and decoding, and achieve the symmetric (see below) coherent information for Pauli channels. They appear to need entanglement assistance, but maybe not? Details to follow.
The key idea is channel polarization. If we have two independent bits $latex X_1, X_2$ going into two bit channels with outputs $latex Y_1, Y_2$, then we can do a change of basis on the input to $latex U_1 = X_1oplus X_2, U_2 = X_2$. We think of this as a channel from $latex U_1$ to $latex Y_1Y_2$, together with a channel from $latex U_2$ to $latex Y_1Y_2$ that uses our imperfect knowledge of $latex U_1$ as side information. The first channel has less mutual information than our original channel, and the second channel has more. Hence we have “polarized” the channels, essentially taking the mutual information from one use and giving it to a different one.
The basic relation is:
$latex I(U_1 ; Y_1 Y_2) + I(U_2 ; U_1Y_1Y_2) = 2 I(W)$
where I(W) refers to the mutual information of the original channel.
If we keep doing this recursively, we can continue until we get a bunch of nearly-perfect (good) channels and a bunch of nearly-worthless (bad) channels. Remarkably, the fraction of good channels is close to the input capacity given uniform inputs (which we call the “symmetric capacity”). To encode, we send messages over the good channels, and freeze inputs to bad channels (both parties know which ones are which). This requires only $latex nlog n$ CNOT gates. You can decode sequentially using ML; the recursive structure of the code makes ML efficient.
The quantum version just uses the fact that the transformation is good in the conjugate basis as well! You just turn the circuit “upside-down” (CNOTs reverse the target and control). For a Pauli channel, the polarization occurs in both amplitude AND phase, albeit in different directions. We only need to analyze amplitude and phase in order to verify that a channel can send general quantum states (see Christandl & Winter 2005 for the precise statement).
Some channels are going to be good for both amplitude and phase, so we use those to send entanglement. For the ones which are bad on amplitude, we freeze the amplitude, and similarly, for the ones which are bad on phase, we freeze the phase. We can use preshared entanglement for the channels which are bad on both amplitude and phase.
To decode, just successively and coherently decode both the channels. Reuse the classical decoders. Decode first amplitude, then phase conditional on the given amplitude. You can achieve the (symmetric) coherent information as the net rate.
From the numerical simulations, (see picture) it seems that this entanglement assistance is never actually needed!
In fact, for channels with low enough error rates they can show this rigorously. The proof techniques are similar to the original polar code proof techniques and use channel martingales.
One open question is to resolve whether or not the entanglement assistance is really necessary. Also, can they extend this to other channels? Yes! It works (joint work in progress with M. Wilde.) Plus, you can still do sequential decoding.

Sergey Bravyi and Robert Koenig:
Disorder-assisted error correction in Majorana chains

Consider, if you haven’t already, the toric code Hamiltonian $latex H_0$ given by minus the sum of toric code stabilizer generators. Now perturb it lightly by adding local Z fields: $latex V= -msum_i Z_i$. What does the storage fidelity look like (at zero temperature) as a function of time? That is, what happens if we encode an adversarially chosen pair of qubits, wait a time t, and then try to recover the qubits? Let Tstorage denote the longest time we can wait and still get 99% fidelity.
The bad news: $latex T_{rm storage} = O(log N)$.
Crazy idea: Add randomness!
The idea is that the perturbation creates particles, which then hop around, creating strings that quickly lead to logical errors. But adding a little disorder can slow them down via the phenomenon of Anderson localization! In fact, the eigenfunctions can be exponentially localized. This is a nice example of a phenomenon in which something that sounds like a negative result—disorder making quantum particles move even more slowly than diffusion—turns into a postiive result, when we can use it to limit our old enemy, Eve.
Challenges: need it to work for many particles, and in 2-D. This is intractable to simulate classically.
So… instead we consider a 1D model that we can analyze, viz. unpaired Majorana modes in quantum wires (due, of course, to Kitaev).
The Hamiltonian is a sum over three types of terms: a chemical potential ($latex a_j^dagger a_j$), a hopping term ($latex a_j^dagger a_{j+1} + h.c.$), and a Cooper pair term ($latex Delta^*a_i^dagger a_j^dagger + Delta a_ia_j$). The base Hamiltonian has a 2-fold degenerate ground state (thus protecting a qubit). With respect to Majorana operators $latex c_j$, the Hamiltonian becomes the super simple:
$latex frac{J}{2} sum_{j=1}^{N-1} i c_{2j} c_{2j+1}$
The perturbation is $latex V = sum_{j=1}^{N-1} i mu c_{2j-1} c_{2j}$.
The resulting error-correction procedure resembles the minimum-weight matching procedure for the toric code, but is much simpler because it works on a line.
This code resembles the lowly repetition code, but actually has nontrivial distance.
Now what happens if the perturbation has randomly-varying weights. That is, $latex V = sum_{j=1}^{N-1} i mu_j c_{2j-1} c_{2j}$ for random $latex mu_j$.
The expected storage time results can be summarized in the following table.

no disorder disorder implications
weak or no perturbations, with small N $latex exp(Omega(N))$ $latex exp(Omega(N))$ not so nice…
strong perturbations, with small N $latex O(log N)$ $latex Omega(N)$ nice…
large N $latex O(log N)$ $latex exp(Omega(N))$ very nice!

Proofs are a mix of rigorous proofs and simulation.
Dynamical localization resembles that obtained for the 1-paricle Anderson model.
$latex c_q(t) = sum_p R_{p,q}(t) c_p$ with $latex |R_{p,q}(t)| leq e^{-|p-q|/t}$.
That’s only for one particle, though. They actually need to prove a bound on the expected determinant of submatrices of $latex R$. Using this, they can show the fidelity remains high for an exponentially long time. This is interesting not only because of its implications for quantum memory, but because understanding multi-particle localization is a tough problem in general.
The proof uses quasi-adiabatic continuation for free fermions to argue that a superposition in the ground space is mostly preserved. Or more precisely, that the only damage occuring to it consists of correctable errors. The techniques look important and interesting. Properties such as gap stability that they prove might of interest elsewhere.
But they don’t solve everything: in some cases, they need numerical simulation. They use a number of tricks to tame the exponentials that would come from naive algorithms; in particular, taking advantage of the fact that they are working with fermionic Gaussian states (building on, and improving, an old result of Terhal and DiVincenzo).

QIP 2012 Day 1

Sergey Bravyi, Topological qubits: stability against thermal noise.
Joint work with Jeongwan Haah, based on arXiv:1105.4159.


Sergey opened his talk with an image of a bagel in a Solfatara landscape, meant to evoke topogically-protected qubits (get it? a bagel?) in a thermal (steam) bath.
The question is whether we can engineer an energy landscape with high barriers between encoded states so as to enable passive quantum error correction; i.e. to create a topologically protected qubit. Here are the main ingredients:

  • Stabilizer codes on lattice with local generators and code distance $latex dapprox L$, where $latex L$ is the lattice size.
  • A code Hamiltonian with energy penalty for states outside logical subspace. The Hamiltonian is defined simply by having a violated stabilizer cost unit energy.
  • Thermal noise: should be described by a Markovian master equation with local jump operators. The spectral density should obey detailed balance; this will be the place where temperature enters.
  • Finally, there should be a decoder. Here we consider the RG (renormalization group) decoder.

Kitaev’s toric code (1997) was the first example of a topological QECC. If you’re not familiar with it, then you should read about that instead of the current results. The key idea is that the errors are strings that wrap around the torus, and so have length at least $latex L$. However, since a partial string incurs only a constant energy penalty, the memory time (=amount of time until the state degrades by a constant amount) is $latex e^{-2beta}$ [arXiv:0810.4584] independent of the lattice size! This is pretty disappointing; we can get this performance by not encoding the qubit at all.
Indeed, any 2D topological stabilizer code has constant energy barrier [0810.1983], which implies that relaxation to thermal equilibrium occurs at a rate independent of system size $latex L$.
This no-go result is daunting, but in higher dimensions we can still achieve some nontrivial topological protection. Here’s what we want: A topological stabilizer codes in a D-dimensional array with commuting Pauli stabilizers $latex G_a$, each with support of size O(1), and with each qubit acted on by O(1) stabilizers. The ground space of the corresponding Hamiltonian is the set of states that are +1 eigenvectors of every stabilizer. We would like the distance to grow at least linearly with lattice size; such models have intrinsic robustness.
Fortunately, there is a shining example of a spatially local stabilizer code with all the self-correcting properties we want. Unfortunately, it requires D=4; it is the 4-d toric code. Here is its energy landscape.

Defects are loops around error membranes, and any sequence of single-qubit Pauli errors mapping a ground state to an orthogonal one must create roughly $latex L$ defects at some step. Thus the memory time is exponential in $latex L$ for low $latex T$ [0811.0033].
All 3D stabilizer Hamiltonians studied so far have constant energy barrier. Indeed this is true of all 3D codes that are invariant under translation and “scale invariance” [1103.1885]. (Here, scale invariance means that the ground state degeneracy doesn’t depend on the size of the lattice.) Haah’s breakthrough 1101.1962 is the first example of a 3-d stabilizer code without string-like logical operators. In his model, defects cannot move very far.
Our results give self-correcting properties of any topological stabilizer codes that obey the no-strings rule, e.g. Haah’s codes.

  1. Their energy barrier is $latex Omega(log L)$
  2. They have partial self correction. Specifically, the Arrhenius law together with the logarithmic energy barrier gives lifetime $latex L^{cbeta}$, for any inverse temperature $latex beta>0$, and for $latex Lleq L^*$ for some critical value $latex L^*sim e^{beta/3}$. See figure:

This lifetime reaches a maximum value (as a function of L) when L is proportional to $latex e^{beta/3}$. Thus, choosing L optimally results in a memory lifetime of $latex T_{mathrm{mem}} sim e^{cbeta^3/3}$.
One example of a code with no string-like errors is the 3-d cubic code of Haah. It is obtained by placing two qubits at each site of a cubic lattice and tiling the following stabilizer generators. See figure (from the Bravyi-Haah paper).

The noise model: One can get a Markovian noise model by considering the Davies weak coupling limit, where the Lindblad operator $latex A_{k,omega}$ transfers energy $latex omega$ from the system to the bath. The spectral density $latex r(omega)$ obeys detailed balance: $latex r(omega) = exp(-beta omega) r(omega)$. Note that this is the only place where temperature enters.
The decoder: After measuring the syndrome, the correction operator is found using ideas from Jim Harrington’s thesis [2004; I can’t find it online] and the Renormalization Group decoder from 1006.1362. See photo.
The strategy is to:

  1. Find connected clusters
  2. For each connected cluster C find the min enclosing box and try to annihilate C by an operator acting within the box.
  3. Stop if no defects left.
  4. If some defects are left, then increase the length by factor of 2 and repeat.

The RG decoder stops when all defects have been annihilated or when length reaches the lattice size, in which case we have failure. We can also have failure when all defects have been annihilated but the system has been returned to a wrong ground state.
The RG decoder can be implemented in time poly(L) (and can be parallelized to run in time log(L)).
We can define the memory time as follows. We time evolve according to the master equation given the local quantum jump Lindblad operators. After some time $latex t$, we run the decoder. Then we check the worst-case (over all initial states) of the trace distance between the initial and final state.
$latex | mathcal{D}(rho(t)) – rho(0) |_1 le O(t) N 2^k exp(-beta m) (1+exp(-beta))^N.$
Here N is the number of physical qubits, k is the number of logical qubits, t is the time, $latex beta$ is inverse temperature and m is the number of errors that can be corrected by the decoder (or more precisely, the height of the energy barrier.) Essentially, there is a temperature/energy factor that competes with an entropic factor.
Analyzing it we find that the memory time grows greater than $latex L^(c beta)$ as long as $latex L le exp(beta/3)$
This gives large but finite lower bound on memory time at a fixed temperature.
Is it optimal? Maybe the analysis is pessimistic and the Haah code actually has an exponentially large lifetime? Maybe growing L can result in unbounded memory time when $latex beta$ is sufficiently large? The authors carried out a Monte Carlo simulation of Haah code with only X errors to test these theories. It used 1000 days of CPU time on IBM’s Blue Gene.
The plots are qualitatively similar to the lower bounds on memory time. In particular, they show that the maximum memory time scales quadratically with $latex beta$ and that the memory time for fixed temperature increases with $latex L$ and then starts to decrease.
How do the proofs go?
Idea 1: The no-strings rule implies localization of errors. That is, any error E can be written as $latex E=E_{loc} cdot G$ where G is a stabilizer and $latex E_{loc}$ is supported on at most 2 neighborhoods.
In order for the accumulated error to have large weight at least one intermediate syndrome must be nonsparse, with one pair of defects within distance a of each other.
Idea 2: Uses scale invariance of the nostrings rule.
Define sparseness and denseness at different scales. A syndrome which is dense at p consecutive scales will include a cluster of p defects.
Show that to implement a logical operation, at least one intermediate syndrome must be dense at roughly log(L) differnet scales.

Mark Wilde,
Advances in classical communication for network quantum information theory, based on 1109.0802 and 1102.2624

Consider the problem of an interference channel, with two senders and two receivers. This problem is hard enough in the “ccqq” case, meaning that the channel takes two classical inputs to a bipartite quantum output: i.e. $latex x_1,x_2 rightarrow rho_{x_1,x_2}^{B_1B_2}$. This is depicted here (figure taken from their paper).

Along the way to understanding this, we’ll need to consider a multiple-access channel, with two senders and one receiver. The achievable rate region is in a sense an optimistic one, bounded by the obvious constraints:

$latex R_1 leq I(X_1;B|X_2)$
$latex R_2 leq I(X_1;B|X_2)$
$latex R_1 + R_2 leq I(X_1X_2;B)$

Why are these obvious? The first two constraints are what we would obtain if the receiver has successfully decoded one message, and both parties get to use that information to design a joint encoder-decoder pair. The last constraint is what we would get if the two senders could work together, apart from the requirement that their average input to the channel should be product across the two senders. Thus, these are natural upper buonds. The nonobvious part is that this can be achieved.
The coding theorem was obtained by Andreas Winter, when he was a grad student (quant-ph/9807019). How does it work? First, we observe that the desired rate region can be obtained as a convex combination of the “corner points” of the rate region: $latex (I(X_1;B), I(X_2;B|X_1))$ and $latex (I(X_1;B|X_2), I(X_2;B))$.
To achieve one of these corner points, use the fact that the two senders are sending uncorrelated inputs, so we can average over the inputs of one sender (say $latex X_2$) to obtain an effective single-sender/single-receiver channel, for which coding is possible at rate $Iatex I(X_1;B)$. Of course, measurement is potentially destructive, but since we are operating in the regime of very low error, we can use the fact that measurements with nearly certain outcomes cause very little damage (the infamous “Tender Measurement Lemma”). (Ok, it’s mostly called the “gentle measurement lemma” but I like “tender.”) Thus, the decoder obtains $latex X_1$ and, conditioned on it, can decode $latex X_2$. (Clearly the sender knows $latex X_1$ as well. Note that this is no longer true for the multiple-sender case.)
Sequential Decoding:. To do decoding, we need to use the typical projector for the average state, as well as the conditionally typical projectors for codewords. The HSW idea is to use the pretty good measurement. The idea of sequential decoding is, in analogy with the classical idea, is to check each codeword sequentially. It works out pretty similarly, with the measurement operators in both cases being lightly distorted versions of the conditionally typical projectors.
The crucial lemma making this possible is from 1109.0802, by Pranab Sen. He calls it a “noncommutative union bound”. The statement is that

$latex 1- mathrm{tr}(Pi_1cdotsPi_n rho) le 2 sqrt{sum_{i=1}^n mathrm{tr}((1-Pi_i) rho)}$

Successive decoding: In general, we’d like to analyze these problems of multipartite information theory using the traditional single sender-singler receiver tools, like HSW. Since each individual code works with exponentially small error, and the gentle-measurement Lemma states that decoding causes exponentially small damage, we should be able to compose several protocols without much trouble. The only subtlety is that the typical projectors don’t commute, which is where the noncommutative union bound comes in. We apply it to the following “intersection projectors.”
$latex tildePi_{x_1^n, x_2^n} leq Pi Pi_{x_1^n} Pi_{x_1^n, x_2^n} Pi_{x_1^n} Pi$
Most important open problem: find a general three-sender quantum simultaneous decoder. Solving this should hopefully yield the insights required to handle an unlimited number of senders.

Nilanjana Datta,
Quantum rate distortion, reverse Shannon theorems and source channel separation,
joint work with Min-Hsiu Hsieh, Mark Wilde, based on 1108.4940.

Compression and transmission: the source coding and noisy coding theorems of Shannon. The fundamental limit on compression is the entropy.
Lossless compression is often too stringent a condition. For example, consider jpg compression, which gives faithful images (to a human eye) despite throwing away lots of bits of information. We consider Rate Distortion Theory, i.e. the theory of lossy data compression. We are interested in the maximum rate of compression given a fixed maximum amount of distortion. Define the rate distortion function:
$latex R_c(D) =$ minimum classical asymptotic cost for sending many copies of state $rho$ with per-copy distortion $latex leq D$, where $latex c$ is for classical.
For a fixed value of the distortion function $latex 0le D < 1$, we work in the storage setting. Define a quantum rate distortion function in the asymptotic limit.
Barnum (in quant-ph/9806065) proved the first lower bound on the quantum rate distortion function:
$latex R_q(D) ge min_{S_rho(N,D)} I_c(rho,N)$
where the term on the right is the coherent information and $latex S_rho(N,D) = {N:mathrm{CPTP} : d(rho,N)le D}.$ But, the coherent information can be negative, so it can’t be tight lower bound in all cases.
Now move to the communication setting. Can define the entanglement-assisted quantum rate distortion function. They can find a single-letter formula for it, given by $latex min_{S_rho(N,D)} frac12 I(R:B)_omega$. (In terms of the quantum mutual information.) This is both a lower bound and it is achievable by using quantum state splitting. Alternatively, the achievability follows from the quantum reverse Shannon theorem (QRST).
Unassisted quantum rate distortion function can also be found by the using the QRST. Need to use a regularized formula.

Jon Yard,
Quantum communication with Gaussian channels of zero quantum capacity,
joint work with Graeme Smith and John A. Smolin, based on 1102.4580.

The title on arxiv.org is “Gaussian bosonic synergy: quantum communication via realistic channels of zero quantum capacity”. Realistic? Synergy?? Think about this, kids, before you go off to work in industrial research.
This paper concerns the fascinating topic of channels with zero quantum capacity. For zero classical capacity, these channels are simple to describe. Here is one example.

But zero quantum capacity occurs even for channels whose output depends on the input. For example, consider the completely dephasing channel, aka a classical channel. Obviously this has no classical capacity. The fact that this channel has zero capacity is both because it is anti-degradable (meaning that Eve’s output can simulate Bob’s; this implies 0-capacity by no-cloning) and because it is PPT, meaning it maps every input state to a PPT output state, or equivalently, its Jamiolkowski state (obtained by feeding in half of a maximally entangled state) is PPT. Sadly, these two conditions are still pretty much our only examples of zero-capacity channels. See a previous talk of Graeme’s for a framework that could potentially expand this set of conditions.
Let’s talk a bit more about those two examples. The anti-degradable channels are intuitively those that give more to Eve than Bob. So erasure channels with erasure rate >50% count, attenuation channels (for photons; can be modeled by a beamsplitter) and certain depolarizing channels (using an argument due to Brandao, Oppenheim, and Strelchuk http://arxiv.org/abs/1107.4385). On the other hand, PPT channels are at least easy to calculate, and include some channels with zero private capacity.
The general question of characterizing zero-capacity channels is a very interesting one, and one that it’s not clear if we are up to. But here is a nice specific version of the problem. The capacity of the depolarizing channel drops to zero at noise rate $latex p^*$, where $latex 0.2552 leq p^* leq 1/3$. What is $latex p^*$???
A good quote from Jon.

Superactivation demonstrates that asking about the capacity of a quantum channel is like asking about a person’s IQ: one number isn’t enough to capture everything you might want to know.

The main result: There exist Gaussian channels, each with zero quantum capacity, that can be combined to achieve nonzero capacity. These channels are more or less within the reach of current experiments (not sure about all the decoding), requiring about 10dB of squeezing and about 60 photons/channel. This is interesting in part because Gaussian channels seemed to be so well-behaved! For example, there is no NPT bound entanglement in Gaussian channels.
Infinite dimensions: This paper works in infinite dimensions, which many in our field are inherently uncomfortable with, and others are inexplicably drawn to. Like representation theory, or complexity classes with lots of boldface capital letters, the presence of phrases like “Quasi-free maps on the CCR algebra” can be either a good or a bad thing, depending on your perspective.
Q: For those of us who of prefer finite dimensions, can we truncate the Hilbert space, perhaps by taking advantage of a constraint on the total number of photons?
A: In principle yes, but then the channels are no longer Gaussian, so our analysis doesn’t work, and in general, Gaussian things are easier to analyze, so that is a sense in which infinite dimensions can actually be simpler to deal with.

Runyao Duan,
Bounds on the distance between a unital quantum channel and the convex hull of unitary channels, with applications to the asymptotic quantum Birkhoff conjecture,
joint work with Nengkun Yu and Quanhua Xu.

The story behind this work starts with the Birkhoff-von Neumann theorem (often called Birkhoff’s theorem) which states that doubly stochastic matrices (matrices with nonnegative entries and all rows and columns summing to one) are convex combinations of permutations. An analogous claim for quantum channels is that unital channels (i.e. mapping the maximally mixed state to itself) are mixtures of unitaries. However, this claim is false. More recently, Smolin, Verstraete and Winter conjectured in quant-ph/0505038 that many copes of a unital channel should asymptotically approach convex combinations of unitaries. (The rest of their paper contained evidence suggestive of this claim, and in general has nice results that are an important precursor of the merging revolution in quantum information theory.) This conjecture was called the Asymptotic Quantum Birkhoff Conjecture (AQBC) and was discussed on Werner’s open problems page. A few years later, it was shown to hold for everyone’s favorite unital-but-not-mixture-of-unitaries channel (What’s that? The one with Jamiolkowski state equal to the maximally mixed state on the antisymmetic subspace of $latex mathbb{C}^3 otimes mathbb{C}^3$ of course!) by Mendl and Wolf.
Sadly, the AQBC is also false, as Haagerup and Musat recently proved. However, their paper uses von Neumann algebras, and their abstact begins

We study factorization and dilation properties of Markov maps between von Neumann algebras equipped with normal faithful states, i.e., completely positive unital maps which preserve the given states and also intertwine their automorphism groups.

(Another approach, in preparation, is due to Ostrev, Oza and Shor.)
This raises a new open question: is there a nice simple proof that finite-dimension-loving quantum information people can follow without having to learn any new math? (Note: probably we should learn more about von Neumann algebras. Just like we should make our own pizza from scratch. But if there’s a good pizzeria that delivers, I’d still like to hear about it.)
This result delivers, by disproving the AQBC with an elegant, intuitive proof. The key technical lemma is the kind of tool that I can imagine being useful elsewhere. It states that, for the problem of distinguishing two convex sets of quantum channels, there exists a single measurement strategy that works well in the worst case. This builds on a similar lemma (due to Gutoski-Watrous ‘04 and Jain ‘05) is just minimax applied to state discrimination: If we want to distinguish two convex sets of density matrices, then there exist a single measurement that works well in the worst case. How well? Its performance is at least as good as what we get by choosing the worst pair of states from the two convex sets, and then choosing the optimal measurement based on these.
This paper extends this result to sets of quantum channels. The optimal distinguishability of a pair of quantum channels is given by their diamond-norm distance, which is simply the largest trace distance possible between their outputs when applied to one half of some (possibly entangled) state. So when you want to distinguish convex sets of channels, then they use minimax again to show that there exists a measurement strategy (this time consisting both of an input state to prepare AND a measurement on the output) that works well for any worst-case choice of input channels. This set of measurement strategies sounds potentially non-convex; however, Watrous’s SDP characterization of the diamond norm shows that measurement strategies form a convex set, so everything works out.
Next, this paper applies this to find ways to estimate the distance between a unital channel and the convex hull of unitary channels. To do this, they use a type of channel which they call a Schur channel (more commonly known as Schur multipliers): given a matrix $latex A$, define the channel $latex Phi_A(X) = A circ X$. Such channels are called Schur channels.
Thm:TFAE:

  1. $latex Phi_A$ is a Schur channel
  2. $latex Phi_A$ is unital with diagonal Kraus operators
  3. A is positive and $latex a_{k,k}=1$ for each $latex 1leq kleq d$.

Using Schur channels, and their discrimination lemma, they are able to make short work of the AQBC. Their next result is that any Schur channel for which conv(Kraus operators of the channel) doesn’t include any unitary is a counterexample to AQBC. This will follow from the fact that the closest random-unitary channel to a Schur channel can be taken (up to factor of 2 in the distance) to be a mixture of diagonal unitaries.
This work doesn’t tell us all of the AQBC-violating channels, but it does describe a large class of them.
They also mentioned connections to Grothendieck’s inequality (work in progress) and Connes’ embedding problem. Connections to these two topics were also mentioned in Haagerup and Musat’s work.

Markus Greiner, Quantum Magnetism with Ultracold Atoms – A Microscopic View of Artificial Quantum Matter.

Some motivation: many condensed matter models can’t be tested with current experiments. Even in simple models it is difficult to calculate basic quantities. Want to build new materials. The idea: use ultracold atoms in optical lattices to simulate condensed matter models. The dream of a quantum simulator is to build a special purpose device which is still useful before full-scale quantum computers are built.
He mentioned that quantum simulators are robust to errors… I believe that this is an open question which theorists should be working on.
How do we cool the atoms? First trap the atoms with lasers. For weakly interacting quantum gases, we can form a BEC where all of the particles can be well described by a single wavefunction. But for strongly correlated quantum gases there are interactions which preclude such a description. In an optical lattice, we have atoms interacting strongly on lattice sites. There is new physics here: for example, superfluid to Mott insulator transition. We can use the optical lattice to make synthetic matter: we can simulate electrons in a crystal. With fermionic atoms, we can observe a fermionic superfluid and things like the BEC-BCS crossover. Bose-Hubbard models and high-T_c superconductivity could also be probed in these systems. Also, quantum magnets, low-dimensional materials, etc.
Let’s focus on quantum magnetism. The main experimental tool is the quantum gas microscope. The goal is to take “bottom-up” tools like high-fidelity readout and local control from e.g. ion traps to more “top-down” systems like optical lattices. (This is clearly related to scalability of quantum computation, ultimately.) The quantum gas microscope can image florescence from single atoms trapped in the lattice. To image this, they measure parity.
He then showed a movie of atoms hopping around in a shallow lattice (in the classical regime, obviously). Very cool. Here is one still from the movie:

The Hamiltonian for this system is a hopping term and an interaction term which penalizes multi-site occupations: a basic Bose-Hubbard model. If the hopping terms dominate, then it’s a superfluid; if the on-site repulsion is dominant, we get a Mott insulator. They can get approximately unit site occupation number in the Mott insulator regime.
How to measure interesting quantities, like correlation functions? Measure the site occupation number (mod 2) on neighboring sites. One can push this further and measure string order parameters. In the future, the hope is to measure more complicated things like Chern number.
They observed the Lieb-Robinson light-cone of correlations.
Algorithmic cooling: Suppose that you have random site occupations. Then you can try to “cool” the system by pushing some of the entropy into a subsystem and get a pure state on the rest. They use Rydberg blockade interactions to try to smooth the site occupation number fluctuations, and then transfer to a superfluid. The temperature is not usually constant in their system, but the entropy is consistently low.
Quantum magnetism: Ising spin models. First consider a chain with on-site longitudinal and transverse fields. There are two phases: paramagnetic and antiferromagnetic. By putting a Mott insulator in an electric field, we can observe this type of transition. We can try to extend these results to two dimensional frustrated spin systems. For example, dimerized states (four-state clock model). These models have spin liquid-like groundstates.
Towards a universal quantum computer: The toolbox includes low entropy initialization, single site readout, single site control, and interactions, all with high fidelity. So what does the future hold?
Q: What types of error are these quantum simulators robust to?
A: Temperature is one type of error. Thermalization might not always occur. Sometimes you want some disorder, since real systems also have disorder.
Q: When you say “fidelity” what do you mean by that?
A: We mean the Hamming distance between classical basis states when we talk about paramagnetic order. That is, our fidelity is 99% if a fraction 99/100 of our spins are correctly aligned when we image the sample.
One question that I didn’t get to ask: what can they say about models which have degenerate ground states? I’m obviously thinking about e.g. the toric code and other topologically ordered models. Can they prepare specific pure states in the ground state manifold?

André Chailloux, Optimal Bounds for Quantum Bit Commitment. Joint work with Iordanis Kerenidis, based on 1102.1678.

Some basic primitives for secure transactions:

Unlike QKD, here there are only 2 players here who distrust each other.
Let Alice and Bob’s optimal cheating probabilities be $latex P_A^*$ and $latex P_B^*$.
The best known quantum protocol achieves
$latex max{P_A^*, P_B^*} =3/4$ [Ambainis ‘01]
and the best known lower bound is
$latex max{P_A^*, P_B^*} geq 1/sqrt{2}$ [Kitaev ‘03]
We improve the bound to $latex max{P_A^*, P_B^*} geq 0.739$ and build QBC protocols that get arbitrarily close to this bound.
The proof uses Mochon’s weak coin flipping protocol.

Gilles Brassard, Merkle Puzzles in a quantum world. Joint work with Peter Hoyer, Kassem Kalach, Marc Kaplan, Sophie Laplante, Louis Salvail, based on arXiv:1108.2316

Ralph Merkle was the inventor of Public Key Crypto in 1974 before Diffie and Hellman (but after the classified discovery). Gilles began with a great quote from Merkle:

The human mind treats a new idea the same way as the human body treats a new protein.

See Merkle’s web site for the amazing story of how he proposed public-key cryptography as a class project in grad school, and the professor described it as a “muddled terribly”, and then he submitted it to JACM, which rejected it because it was “outside of the mainstream of cryptographic thinking.”
Key Distribution problem as imagined by Merkle:
Alice and Bob talk over an authenticated public channel. After doing so Alice and Bob should get a shared secret. Can only be computationally secure, but back then it was not even thought to be computationally possible.
Make the eavesdropper’s effort grow as fast as possible compared to Alice and Bob’s effort
We will measure effort by query complexity to a random oracle. Use that model because it allows to prove a lower bound.
Merkle’s 1974 idea:
Choose a hard-to-invert, but injective, function f that acts on $latex [N^2]$.
Alice chooses N random values of x and computes the values of f(x).
Alice sends all these values to Bob.
Bob wants to invert one of them.
He keeps trying his own random x’s until he gets a collision; call it s.
Bob sends f(s) to Alice and Alice finds secret s.
So after N queries by Alice and N by Bob they get the same s
Eavesdropper has seen only N useless f(x) values and one useful f(s) which he has to try $latex N^2/2$ queries to find s, compared to N for Alice and Bob
Ok, so this is only a quadratic separation, but we don’t actually know that DIffie-Hellman is any better. The lower bound was proven by Narak and Mahmoody 08.
What about the quantum world, where the eavesdropper is quantum? Alice and Bob could be quantum but don’t need to be. The channel remains purely classical, so A&B can’t do QKD
If Eve is quantum, and Alice and Bob are classical, then Eve can find the key in O(N) time, which is the same amount of effort that Alice and Bob expend (albeit quantum and not classical). Unfortunately, this breaks the Merkle scheme completely.
It can it be repaired by allowing A&B to be quantum, or even keeping them classical
One option is to keep Alice classical, use a space of size $latex N^3$, and allow Bob to use BBHT to find one preimage in O(N) quantum queries. Eve using Grover needs $N^{3/2}$ so Alice and Bob get an advantage over Eve.
An improved protocol is for Bob to find two preimages in O(N) quantum queries.
Bob sends Alice the bitwise XOR of f(s) and f(s’). Given this Alice uses a table to find the secret.
This gives effort $latex O(N^{5/3})$ for Eve and O(N) for Alice and Bob. (It’s not trivial to show Eve can do this well, and even less to show this is optimal)
The proof involves something similar to Ambainis’ element distinctness algorithm with a quantum walk on the Johnson graph. A crucial observation relates to the compositon of our variant of element distinctness on N elements with searching each function value in a set of size $latex N^2$. This involves a new composition theorem for non-boolean functions.
What if Bob is classical?
Like before, there are two functions f and t, each with N^2 points.
Bob finds two preimages as in Merkle. The rest is the same: bitwise XOR.
How much time does quantum eavesdropper need? $latex N^{7/6}$.
Also, there is a sequence of protocols for which best attack we can find tends to $latex N^2$ in the limit. See paper for details. 🙂

Salman Beigi,
Simplified instantaneous non-local quantum computation with applications to position-based cryptography, joint work with Robert Koenig.

Position based crypto (ie authentication of position) by distance bounding techniques
Challenge response protocol
V1 ———-P—————–V2
Verifies send challenges timed for synch arrival at r0, position of verifier
honest prover computes and sends answers
Challenge/response susceptible to cheating by cheating provers each copying and sending the challenge. This shows why classical protocols are insecure.
Kent Munro Spiller 2011 with in- and security proofs
Cheating strategy uses teleportation
In general a quantum protocol is a bipartite unitary.
An attack is a combination of local operations, shared entanglement and one round of communication sufficient to simulate it.
Vaidman 03 achievse this with double exponential cost in ebits for recursive teleportation
Buhrman et al 09 key insights:
Simplified protocol uses $latex O(2^n)$ ebits
Uses port-based teleportation, due to Ishizaka and Hiroshima (PRL 2009 2009)
Share K maxly entangled states
Highly complicated measurement.
Recovery is very simple. Alice discards all but one of her halves of entangled states
New protocol for instantaneous measurementt
Position based crypto: cheating landscape
Vaidman -doubly exponential can always cheat
Port-based single exponential, suff for cheating
If restricted to classical communication and only linear entanglement can get exponential soundness for Position Based Crypto
If cheaters are allowed q commun and linear entanglement can get only constant

Florian Speelman, Garden Hose Game and Application to Position-Based Quantum Cryptography. Joint work with Harry Buhrman, Serge Fehr, Christian Schaffner, based on arXiv:1109.2563.

Position verification applications:

  • launching nuclear missiles (against enemies not equipped with FTL neutrinos, of course)
  • preventing prank calls of ambulances, SWAT teams or pizzas
  • communicating with right wifi router
  • computing international roaming fees properly
  • for when you want to know you are talking to S. Korea and not N. Korea
  • proving that “live”-blogging is actually occurring from the conference location, and is not just the result of typing out our previous opinions of the results.

On the positive side, can show security if promised there is no shared entanglement,
Show a specific class of schemes to obtain a tradeoff. Increased classical communication for honest players
Example: classically secure but broken by teleportation
Verifier 0 sends psi to Prover
Verifier 1 sends bit to prover
Prover routes psi left or right according to bit
Instead of one bit use a function
V0 sends state and n-bit string x to prover
V1 sends an n bit string y to prover
Prover computes fn f(x,y) and sends psi to V0 or V1 depending on outcome
Nice thing is that it’s no harder for honest prover
But for cheating provers, more entanglement is needed. Can do it with $latex 2times 2^n$ ebits
Garden hose model
A discrete way of looking at attacks of this sort
Alice and Bob share s pipes between them. Alice has a water tap
Alice use hoses like Enigma plug cords to connect their ends ends of hoses in some pattern
f(x,y) is whether water comes out on left or right
Gardenhose complexity of a function is number of pipes needed to compute the fn
Every piece of hose is a teleportation
Can use GH model to prove upper bounds
Equality 2n+O(log n)
Inner Product 4n
Majority n^2
If f is in logspace, then GH(f) is poly(n).
But there do exist fns with exponential GH complexity.
Can prove lower bounds for explicit functions, but best we can prove are linear.

Poster Session 1

The poster session had tons of great posters that we will not remotely do justice. Instead, here are some photos from a few of them. Try to guess which ones. 🙂


Open Science and the Quantumista Condensate

A rare event occurred today here in Seattle. And I’m not just talking about the 20 minutes of partly sunny skies we got at lunchtime. No, this was something rarer still: a quantumista condensate.
What precipitated this joint gathering of the decohered and the coherent? Michael Nielsen was visiting the University of Washington to deliver a distinguished lecture on the topic of his new book: open science. Having seen Michael talk about this subject 3 years ago at QIP Santa Fe, I can say that he has significantly focused his ideas and his message. He makes a very compelling case for open science and in particular open data. He has thought very hard about what makes online collaborative science projects successful at focusing and amplifying our collective intelligence, why such projects sometimes fail, and which steps we need to take to get to the promised land from where we are currently.
The talk was recorded, and as soon as the video becomes available I’ll put a link here. I highly recommend watching it.
Update (12/12): Here is the link to Michael’s talk.
You might be wondering, what is the optimizer doing there? He is in town to give the colloquium to the computer science department. And given all the excitement, Dave Bacon, aka Pontiff++, couldn’t help but sneak over from Google to check things out. He is the one you can blame for coining the horrible phrase “quantumista condensate”, but you probably already guessed that.

Why the laity hope Einstein was wrong.

Although reputable news sources pointed out that most scientists think some more mundane explanation will be found for the too-early arrival of CERN-generated neutrinos in Gran Sasso, recently confirmed by a second round of experiments with much briefer pulse durations to exclude the most likely sources of systematic error, the take-home message for most non-scientists seems to have been “Einstein was wrong.  Things can go faster than light.”  Scientists trying to explain their skepticism often end up sounding closed-minded and arrogant.  People say, “Why don’t you take evidence of faster-than-light travel at face value, rather than saying it must be wrong because it disagrees with Einstein.”  The macho desire not to be bound by an arbitrary speed limit doubtless also helps explain why warp drives are such a staple of  science fiction.  At a recent dinner party, as my wife silently reminded me that a lecture on time dilation and Fitzgerald contraction would be inappropriate, the best I could come up with was an analogy to another branch of physics where where lay peoples’ intuition accords better with that of specialists:  I told them, without giving them any reason to believe me, that Einstein showed that faster-than-light travel would be about as far-reaching and disruptive in its consequences as an engine that required no fuel.
That was too crude an analogy. Certainly a fuelless engine, if it could be built, would be more disruptive in its practical consequences, whereas faster-than-light neutrinos could be accommodated, without creating any paradoxes of time travel, if there were a preferred reference frame within which neutrinos traveling through rock could go faster than light, while other particles, including neutrinos traveling though empty space, would behave in the usual Lorentz-invariant fashion supported by innumerable experiments and astronomical observations.
But it is wrong to blame mere populist distrust of authority for this disconnect between lay and expert opinion. Rather the fault lies with a failure of science education, leaving the public with a good intuition for Galilean relativity, but little understanding of how it has been superseded by special relativity.  So maybe, after dinner is over and my audience is no longer captive, I should retell the old story of cosmic ray-generated muons, who see the onrushing earth as having an atmosphere only a few feet thick, while terrestrial observers see the muons’ lifetime as having been extended manyfold by time dilation.
It is this difference in appreciation of special relativity that accounts for the fact that for most  people, faster-than-light travel seems far more plausible than time travel, whereas for experts, time travel, via closed timelike curves of general relativistic origin, is more plausible than faster-than-light travel in flat spacetime.