The open access wars

Vox has just published an excellent article on open access scientific publishing, “The open access wars“. While the title is a bit too provocative, the article still manages to give an accurate assessment of the current state of play. Although I like the article, I can’t help but nitpick a few points.

From the article,

“We spoke with executives at both Elsevier and Springer Nature, and they maintain their companies still provide a lot of value in ensuring the quality of academic research.”

This is false. Publishers do not add any significant value in ensuring the quality of the academic research. Peer reviewers do that, and even then not consistently. True, the publishers facilitate finding peer reviewers, but this has nothing to do with the actual publishing of the research. The role of the journal itself (sans peer review) is just to market and advertise the research, not to ensure quality. The journal also provides a venue for signaling for the prestige-smitten researcher. It is a personal value judgement how much these other things matter, but they certainly don’t impact the quality of the research.

Organizing the peer review (as opposed to actually reviewing) is the job of a middleman: it may provide a service, but it doesn’t add value to the product and it only drives up prices. This is why non-profit overlay journals like Quantum only cost between 0 and €200 to publish. The average cost per submission for hosting a preprint on the arxiv is less than $7.

Which brings me to my next point: I was also a little disappointed that the article failed to mention arxiv.org. They do mention the rise of preprints, but they only mention prepubmed.org, which apparently only came online in 2007. By contrast, the oldest arxiv preprint that I’m aware of is Paul Ginsparg’s notes on conformal field theory, which were posted in 1988!

That might be the oldest timestamp, but the arxiv only started having regular preprint service in 1991. Still, this means that essentially all physics research in the last 25+ years is available for free online. In practice, this means that any time you need a physics paper, you simply find the associated preprint and read that instead of the journal version. This is especially convenient for a field like quantum information, where all but a handful of papers are available on the arxiv.

Any article on open access should lead with a discussion of the arxiv. It’s one of the most important science-related developments of the last 30 years.

Quantum Advantage

Update (22 May 2017): This Scirate thread seems to have touched a nerve. Since this was previously buried in the comments here, it’s worth promoting to the top of the post. I think that “quantum computational supremacy” addresses the concern. Basically, we use “quantum” as an adjective for our peer group, which makes the analogy to “white” too strong. Adding “computational” emphasizes that it is the computation, not the people, that are supreme.


I’ve had quite a few conversations lately about a comment I left on Scirate. The paper at that link, “Quantum advantage with shallow circuits” by Sergey Bravyi, David Gosset, Robert Koenig, shows a provable separation between analogous classes of quantum and classical circuits, even when the quantum circuit is restricted to nearest-neighbor gates on a 2D grid. This is a fantastic result! My comment, however, wasn’t regarding the result, but rather the title of the paper. I’m just happy that they called it a “quantum advantage” instead of using that other term…
The term “quantum supremacy” is the fashionable name for the quantum experiments attempting to beat classical computers at some given task, not necessarily a useful one. According to current usage, the term (strangely) only applies to computational problems. The theoretical and experimental work towards demonstrating this is wonderful. But the term itself, as any native English speaker can tell you, has the unfortunate feature that it immediately calls to mind “white supremacy”. Indeed, one can even quantify this using a Google ngram search for *_ADJ supremacy over all books in Google’s corpus between 1900 and 2008:

None of these terms has a particularly good connotation, but white supremacy (the worst on the list) is an order of magnitude more common than the others and has, on net, been growing since the 30s. For almost every native speaker that I’ve talked to, and quite a few non-native speakers as well, the taint of this is hard to escape. (For speakers of German or French, this word is a bit like “Vormachtstellung” or “collaboration” respectively.)
The humor surrounding this term has always been in bad taste — talking about “quantum supremacists” and jokes about disavowing their support — but it was perhaps tolerable before the US election in November. Given that there are several viable alternatives, for example “quantum advantage” or even “quantum superiority”, can we please agree as a community to abandon this awful term?
This isn’t about being PC. And I’m not trying to shame any of the people that have used this term. It’s just a poor word choice, and we don’t have to be stuck with it. Connotations of words matter: you don’t say someone is “scrawny” if you mean they are thin, even though my thesaurus lists these words as synonyms. Given the readily available alternatives, the only case I can think of for “supremacy” at this point is inertia, which is a rather poor argument.
So please, say it with me now: quantum advantage.
Update: Ashley Montanaro points out that “advantage” should potentially be reserved for a slight advantage. I maintain that “superiority” is still a good choice, and I also offer “dominance” as another alternative. Martin Schwarz suggests some variation of “breaking the X barrier”, which has a nice feel to it. 

QIP 2015 dead-blogging, Day 4

Warning for typical QIP attendees —
This first talk may have some implications for the real world 🙂

Andrea Mari, Vittorio Giovannetti, Alexander S. Holevo, R. Garcia-Patron and N. J. Cerf.
Majorization and entropy at the output of bosonic Gaussian channels.
Previous Title: Quantum state majorization at the output of bosonic Gaussian channels (Plenary Talk)
abstract arXiv:1312.3545

The solution of the Gaussian optimizer conjecture by Giovannetti, Holevo, and Garcia-Patron has led to enormous progress on a web of interconnected conjectures about Gaussian channels. This includes the phase space majorization conjecture, the additivity and minimum-output Rényi entropy conjectures, strong-converse theorems for the classical capacity of Gaussian channels, the entropy-power inequality conjecture, and more! The only remaining open questions are the entropy-photon number inequality and the quantum capacity.
The proof uses a decomposition of Gaussian channels. Every phase-insensitive Gaussian channel is equivalent to a quantum-limited attenuator followed by a quantum-limited amplifier. Every phase-contravariant Gaussian channel is equivalent to a quantum-limited attenuator followed by a quantum-limited amplifier followed by a noiseless phase conjugation.
It also uses an old result about quantum beam splitters. How can the output of a beamsplitter be distinguished from two independent broadcasts of the same source? Answer: only if the broadcast was a coherent state. Therefore we have the “golden property” of a quantum beam splitter: the only input states producing pure output states for a quantum limited attenuator are coherent states.
The main result of this talk is a proof of the majorization conjecture. This addresses the question: what is the minimum noise or disorder achievable at the output state, optimizing over all possible input states? The output from a coherent input state majorizes all other output states. Namely, if \Phi is a Gaussian channel, then
\Phi(\lvert\alpha\rangle\!\langle\alpha\rvert) \succ \Phi(\rho) for all \rho,  \lvert\alpha\rangle.
The proof uses the decomposition results mentioned above, concavity of entropy, and the breakthrough results of Giovannetti, Holevo, and Garcia-Patron. A similar result holds for any Renyi p entropy for p > 1. There is also a phase space majorization result.
Here is an awesome summary of what this research project has accomplished: [figure: the web of Gaussian-channel conjectures, each one marked as solved.]


Runyao Duan and Andreas Winter. No-Signalling Assisted Zero-Error Capacity of Quantum Channels and an Information Theoretic Interpretation of the Lovasz Number. Previous title: Zero-Error Classical Channel Capacity and Simulation Cost Assisted by Quantum No-Signalling Correlations
abstract arXiv:1409.3426

Coming the morning after the conference dinner, Andreas begins with a “hangover summary” of the main results:

  1. C_0(G) \le \log \vartheta(G) (sometimes known to be strict)
  2. C_{0,E}(G) \leq \log \vartheta(G)
  3. C_{0,NS}(G) = \log \vartheta(G)

On the LHS we have the asymptotic zero-error capacity of a channel, assisted either by nothing (C_0), entanglement (C_{0,E}) or no-signalling correlations (C_{0,NS}). On the RHS we have the Lovasz theta number (see below) which can be calculated efficiently using semidefinite programming.
Classical zero-error capacity depends only on the transition graph, not the probabilities. A useful notion is that of confusability graph, which is an undirected graph on the input symbols of the channel where two symbols are confusable if there is a nonzero probability of confusing them. This encodes the combinatorics of the channel in a natural way. For example, product channels have a product confusability graph whose adjacency matrix is the tensor product of the confusability graphs of the factor channels. (More precisely I+ A(N\times N') = (I+A(N)) \otimes (I + A(N')).)
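To make this concrete, here is a quick numerical check of the identity I+ A(N\times N') = (I+A(N)) \otimes (I + A(N')) on a pair of made-up toy channels (the channels and the helper name are mine, purely for illustration):

```python
import numpy as np

def confusability_adjacency(P):
    """Adjacency matrix of the confusability graph of a channel.

    P[y, x] = probability of output y given input x.  Inputs x and x' are
    confusable (adjacent) if some output is possible under both.
    """
    n = P.shape[1]
    A = np.zeros((n, n), dtype=int)
    for x in range(n):
        for xp in range(x + 1, n):
            if np.any((P[:, x] > 0) & (P[:, xp] > 0)):
                A[x, xp] = A[xp, x] = 1
    return A

# Two toy channels (columns are conditional distributions over outputs).
N1 = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
N2 = np.array([[1.0, 0.5],
               [0.0, 0.5]])

A1 = confusability_adjacency(N1)
A2 = confusability_adjacency(N2)
A12 = confusability_adjacency(np.kron(N1, N2))   # product channel

# The product channel's confusability graph is the strong graph product:
assert np.array_equal(np.eye(6) + A12,
                      np.kron(np.eye(3) + A1, np.eye(2) + A2))
```

The check works because the support of a product channel’s output distribution factorizes, so confusability is decided coordinate-wise.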
The largest codebook for this channel is given by the independence number of the confusability graph; this gives the optimal zero-error capacity. It’s unfortunately NP-complete to compute this property of a graph, and even approximating it is NP hard. However, a seminal result due to Lovasz gives a convex relaxation of this into a semidefinite program, the Lovasz theta number.
\alpha(G) \leq \vartheta(G) = \max \mathrm{Tr}(BJ)  \mbox{ s.t. } B\succeq 0,\ \mathrm{Tr} B=1,\ B_{xy} = 0 \mbox{ whenever } x,y \mbox{ are adjacent in } G
Another useful bound is the so-called fractional packing number, obtained by writing down the natural integer program for the independence number, replacing the w_x\in \{0,1\} constraint with 0\leq w_x \leq 1, and observing that the relaxation can only increase the optimum:
\alpha(G) \leq \alpha^*(\Gamma) = \max \sum_x w_x \mbox{ s.t. } w_x \geq 0, \mbox{ and } \forall y\ \sum_x \Gamma(y|x) w_x \leq 1.
Minimizing over \Gamma consistent with G gives \alpha^*(G) = \min_{\Gamma\sim G} \alpha^*(\Gamma).
Crucially, both the Lovasz theta number and the fractional packing number are multiplicative, and therefore they bound not only the independence number, but also the capacity. Andreas said something surprising: If f(G) is multiplicative and satisfies \alpha(G) \leq f(G) \leq \vartheta(G), then we must have f(G) = \vartheta(G)! This is unpublished still, but follows from the beautiful formula [personal communication from Winter, via my memory, so hopefully not too wrong]:
\vartheta(G) = \sup_H \frac{\alpha(G\times H)}{\vartheta(H)}.
Compare with a similar formula for the fractional packing number:
\alpha^*(G) = \sup_H \frac{\alpha(G\times H)}{\alpha(H)}.
An example, in fact the usual example, is the “typewriter graph”. This has an input x\in\mathbb{Z}_5 and output y \in \{x,x+1 \bmod 5\}. Then \alpha(G) = 2, \alpha(G\times G) =5 > \alpha(G)^2, \vartheta(G) = \sqrt{5} (thus giving a tight upper bound on the capacity) and \alpha^*(G) = 5/2.
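These numbers are small enough to verify directly. A sketch (the independent set \{(i, 2i \bmod 5)\} in the strong product is Shannon's; the rest is brute force, and the \vartheta(G)=\sqrt{5} bound certifying that \alpha(G\times G) is exactly 5 is taken on faith here):

```python
from itertools import combinations

# Pentagon confusability graph of the typewriter channel: vertices Z_5,
# edges between i and i+1 (mod 5).
def adjacent(i, j):
    return (i - j) % 5 in (1, 4)

def independent(vertices):
    return all(not adjacent(i, j) for i, j in combinations(vertices, 2))

# alpha(G) = 2, by brute force over all 32 subsets.
alpha = max(len(s) for r in range(6) for s in combinations(range(5), r)
            if independent(s))
assert alpha == 2

# In the strong product G x G (the confusability graph of two channel uses),
# (i, j) ~ (i', j') iff each coordinate is equal or adjacent and the pairs
# differ.  Shannon's set {(i, 2i mod 5)} is independent, so
# alpha(G x G) >= 5 > alpha(G)^2 = 4.
def strong_adjacent(u, v):
    return u != v and all(a == b or adjacent(a, b) for a, b in zip(u, v))

shannon = [(i, (2 * i) % 5) for i in range(5)]
assert all(not strong_adjacent(u, v) for u, v in combinations(shannon, 2))
```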
We would like to close the gap provided by the above relaxations by allowing additional resources in the encoding and decoding. Shannon did this in his 1956 paper by allowing feedback, and we can also add entanglement and no-signaling correlations.
The Lovasz theta number is in fact an upper bound to the entanglement-assisted zero-error capacity, which sometimes is larger than the unassisted zero-error capacity. This gives an operational (but partial) explanation of why the Lovasz theta function is not always a tight upper bound.
On the other hand, the fractional packing number is equal to the non-signalling-assisted capacity, which is an awesome and nontrivial result.
Now what about the quantum case? Here things get even cooler, although not everything is as well understood. For quantum channels, the quantum generalizations of the transition and confusability graphs are given by the following correspondences:
K = \text{span}\{E_i\} and S = K^\dag K = \text{span} \{E_i^\dag E_j\}.
The authors have also generalized (as in Will Matthews’ talk earlier) the notion of no-signalling operations to ones that take quantum inputs and outputs. Just as the classical no-signalling polytope is a small linear program, this quantum set has a nice SDP characterization.
Now consider \Upsilon(K) = \max \sum_x s_x \mbox{ s.t. } 0 \leq R_x \leq s_x \Pi_x^\perp,\ \sum_x (R_x + s_x \Pi_x) = I. This is
\leq A(K) = \max \sum_x s_x \mbox{ s.t. } 0 \leq s_x, \sum_x s_x \Pi_x \leq I.
Then the “most amazing thing”, according to Andreas, is the following theorem:
C_{0,NS}(K) = \lim \frac{1}{n} \log \Upsilon(K^{\otimes n}) = \log A(K).
Note that this is not just an inequality, but an actual equality!
They show actually that \Upsilon(K^{\otimes n}) \ge n^{-c} A(K)^n.
The proof is reminiscent of the recent and inexplicably not-invited Fawzi-Renner breakthrough on the quantum conditional mutual information.
Now in terms of G instead of K: minimizing A(K) over channels with confusability graph G gives \min A(K) = \vartheta(G), which is the first information-theoretic application of the Lovasz theta function.
Some of the closing remarks were that

  • SDP formulas for assisted capacity and simulation cost (one shot)
  • SDPs can regularize to relaxed SDPs!
  • Capacity interpretation of Lovasz theta number
  • Is there a gap between C_{0,E}(G) and \log\vartheta(G)?
  • Is regularization necessary?

Mario Berta, Omar Fawzi and Volkher Scholz.
Quantum-proof randomness extractors via operator space theory;
abstract arXiv:1409.3563

This talk is about producing high-quality (i.e. nearly uniform) randomness from [necessarily longer] strings that contain “lower-quality” randomness. Tools for doing this are called extractors; other closely related tools are condensers and expanders. Why would we need to do this? Well suppose we start with some uniform randomness and then leak some information about it to an adversary (Eve). Then Eve’s subjective view of our randomness is no longer uniform. Applying an extractor can make it again close to uniform. Even in a non-cryptographic setting they can be useful. Suppose we run a randomized algorithm using some large uniformly random seed. Conditioned on the output of this algorithm, the seed will no longer be uniformly random. If we want to continue using it, maybe even for later steps of a larger algorithm, we will need to clean it up using something like an extractor.
This establishes the need for extractors and their crucial property: their goal is to make a string random conditioned on some other string (e.g. Eve’s knowledge, or the earlier outputs of the randomized algorithm; call this the “side information”). This talk will consider the case when side information is in fact quantum. Conditioning on quantum things is always tricky, and one hint that the math here will be nontrivial is the fact that “operator space theory” is in the title.
One unpleasant fact about extractors is that they cannot be deterministic (this is a useful exercise for the reader). So they usually use some extra input, guaranteed to be uniform and independent of the source. The smallest possible seed size is logarithmic in the input size. Also the number of output bits can be (more or less) no larger than the min-entropy of the source. So the key parameters are the input size, the input min-entropy, the input seed size, the number of output bits and their trace distance from being uniform and independent of the side information.
One example of a nice classical result is the leftover hash lemma.
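For the curious, here is a hedged sketch of the leftover hash lemma at toy scale, using the standard Toeplitz family of 2-universal hashes; the tiny parameters (n=4 input bits, m=1 output bit, min-entropy k=2) and the helper names are my own choices, not from the talk:

```python
import numpy as np
from itertools import combinations, product

n, m, k = 4, 1, 2        # input bits, output bits, source min-entropy
seed_len = n + m - 1     # a binary Toeplitz matrix is fixed by first row + column

def toeplitz_hash(seed, x):
    """Apply the m-by-n Toeplitz matrix defined by `seed` to bit-tuple x, mod 2."""
    T = np.array([[seed[i - j + n - 1] for j in range(n)] for i in range(m)])
    return tuple(T.dot(np.array(x)) % 2)

seeds = list(product([0, 1], repeat=seed_len))
inputs = list(product([0, 1], repeat=n))
table = {(s, x): toeplitz_hash(s, x) for s in seeds for x in inputs}

def dist_from_uniform(source):
    """Trace distance of the pair (seed, Ext(x, seed)) from uniform."""
    p = {}
    for s in seeds:
        for x in source:
            key = (s, table[s, x])
            p[key] = p.get(key, 0.0) + 1.0 / (len(seeds) * len(source))
    cells = len(seeds) * 2 ** m
    return 0.5 * (sum(abs(v - 1 / cells) for v in p.values())
                  + (cells - len(p)) / cells)

# Leftover hash lemma: for any source of min-entropy >= k, the output is
# (1/2)*sqrt(2^m / 2^k)-close to uniform; flat sources are the extreme points.
bound = 0.5 * (2 ** m / 2 ** k) ** 0.5
worst = max(dist_from_uniform(src) for src in combinations(inputs, 2 ** k))
assert 0 < worst <= bound + 1e-12
```

Note how the seed length n+m-1 is linear here; the logarithmic seeds mentioned above require fancier constructions than 2-universal hashing.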
Can we still get the same result if the adversary is quantum? This will have implications for privacy amplification in quantum cryptography and also (not sure why about this one) properties of quantum memory.
Here min-entropy becomes the conditional min-entropy, which involves a maximization over all guessing strategies and measures knowledge of an adversary with access to a q system correlated with the source.
The main result here is a mathematical framework to study this, based on operator space theory. Why operator space? We can define C(Ext, k) to be the maximum advantage for a classical adversary against the extractor Ext on sources with k bits of min-entropy. The key insight is that this is a norm, specifically an operator norm. Sources with \geq k bits of min-entropy correspond to probability distributions (i.e. \|x\|_1 \leq 1) with all entries \leq 2^{-k} (i.e. \|x\|_\infty \leq 2^{-k}). The intersection of these two constraints defines a centrally symmetric convex set (if we subtract off the uniform distribution, I suppose), which can then define a norm. One other example of such a hybrid norm is the family of Sobolev norms. Now the achievable bias is like the maximum 1-norm of the image of such a vector when we act on it with the extractor Ext, which we can think of as a linear map. So this is an operator norm: the norm of Ext is the max of the norm of Ext(x) divided by the norm of x, where we measure the numerator in the \ell_1 norm and the denominator in this funny hybrid norm.
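A toy computation may help here. The extreme points of the hybrid-norm ball are sources flat on 2^k inputs, so for a small inner-product extractor (my example, not the talk's) we can evaluate C(Ext, k) exactly by enumerating them:

```python
from itertools import combinations, product

n, k = 3, 1
xs = list(product([0, 1], repeat=n))
seeds = xs

def ext(x, s):
    """Inner-product extractor: one output bit <x, s> mod 2."""
    return sum(a * b for a, b in zip(x, s)) % 2

def bias(source):
    """l1-distance of the pair (seed, output) from uniform, for a flat source."""
    p = {}
    for s in seeds:
        for x in source:
            key = (s, ext(x, s))
            p[key] = p.get(key, 0.0) + 1.0 / (len(seeds) * len(source))
    cells = 2 * len(seeds)
    return 0.5 * (sum(abs(v - 1 / cells) for v in p.values())
                  + (cells - len(p)) / cells)

# Maximize over the extreme points of the hybrid-norm ball: sources flat
# on 2^k inputs.  For this extractor every pair gives bias exactly 1/4.
C = max(bias(src) for src in combinations(xs, 2 ** k))
assert abs(C - 0.25) < 1e-12
```

For this extractor the norm is attained at every extreme point, which is a small illustration of why maximizing over flat sources suffices.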
The advantage of a quantum adversary is Q(Ext, k) which is like the cb (completely bounded) version of the above. Basically suppose that instead of the entries of x being numbers they are matrices. This is kind of like what team Madrid did with nonlocal games.
They also define an SDP relaxation which has the advantage of being efficiently computable.
It follows trivially from the definitions that
C(Ext,k) \leq Q(Ext,k) \leq SDP(Ext,k).
The nontrivial stuff comes from the fact that the SDP relaxations are in some cases analytically tractable, e.g. showing that small-output and high input entropy extractors are quantum-proof (i.e. secure against quantum adversaries).
A nice open question: Given that there is an SDP, can we define a convergent hierarchy?


Richard Cleve, Debbie Leung, Li Liu and Chunhao Wang.
Near-linear construction of exact unitary 2-designs
abstract

The basic idea of a unitary t-design is that it is a discrete set of unitary operators that exactly reproduce the first t moments of the Haar measure, i.e. the uniform measure on the unitary group. This construction lets us sample efficiently and still reproduce important properties of the Haar measure. But we don’t just want constructions with small cardinality, we also want them to have efficient gate complexity.
The speaker jokingly says, “unfortunately, these have a lot of applications.” This includes randomized benchmarking, decoupling and error-tolerant QKD schemes. There is another joke about how decoupling doesn’t mean breaking up (romantic) couples. But if Alice and Bob are entangled, and one passes through a decoupling map, then they end up nearly separable! That sounds like a breakup to me. Note that a weaker randomizing map would merely leave them with a data hiding state, which I suppose corresponds to the facebook status “It’s Complicated.” These jokes are still in focus groups, so please forgive them.
We would like to have unitary designs that have low gate complexity. The main result of this talk is a Clifford-only unitary 2-design that uses O(n \log n \log \log n) gates, but it assumes an extension of the Riemann hypothesis. Without this hypothesis, they also have a non-Clifford construction with the same complexity, and a Clifford-only scheme with complexity O(n \log^2 n \log \log n). Sampling uniformly from the Clifford group has gate complexity O(n^2/\log n), so this is an improvement.
One can construct these 2-designs by writing gates in the form S^{\otimes n} H^{\otimes n} M_r, where the matrix M_r has a decomposition with low complexity that is the main technical contribution of the authors. By using the result that Pauli mixing by conjugation implies unitary 2-design, one only needs to consider certain forms of the matrix M_r. Ronald de Wolf suggested that these results could be improved a bit further by using the fast integer multiplication algorithms due to Fürer.
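As a sanity check on the “Pauli mixing implies unitary 2-design” criterion, one can generate the 24-element single-qubit Clifford group (mod phase) from H and S and confirm that conjugation spreads each nontrivial Pauli uniformly over the nontrivial Paulis. This is just my own illustration of the criterion, not the authors' construction:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]])

def canon(U):
    """Canonical form modulo global phase, as a hashable rounded tuple."""
    first = U.flat[np.argmax(np.abs(U) > 1e-9)]
    U = U / (first / abs(first))
    return tuple(np.round(U, 6).ravel().tolist())

# Close {H, S} under multiplication: the single-qubit Clifford group mod phase.
group = {canon(np.eye(2)): np.eye(2)}
frontier = list(group.values())
while frontier:
    nxt = []
    for U in frontier:
        for G in (H, S):
            V = G @ U
            c = canon(V)
            if c not in group:
                group[c] = V
                nxt.append(V)
    frontier = nxt
assert len(group) == 24

paulis = {'X': np.array([[0, 1], [1, 0]]),
          'Y': np.array([[0, -1j], [1j, 0]]),
          'Z': np.array([[1, 0], [0, -1]])}

# Pauli mixing: each nontrivial Pauli is sent (up to sign) to each of
# X, Y, Z equally often, namely 24/3 = 8 times.
for P in paulis.values():
    counts = dict.fromkeys(paulis, 0)
    for U in group.values():
        Q = U @ P @ U.conj().T
        for name, R in paulis.items():
            if np.allclose(Q, R) or np.allclose(Q, -R):
                counts[name] += 1
    assert all(c == 8 for c in counts.values())
```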


Carl Miller and Yaoyun Shi.
Universal security for randomness expansion. Previous title: Universal security for quantum contextual devices
abstract arXiv:1411.6608

Carl opens his talk by saying that this result has been a goal of the U of M group for the past four years. It then proceeds in suitably epic fashion by looking up the definition of randomness from The Urban Dictionary (highlight: “A word often misused by morons who don’t know very many other words.”) and moving onto modern cryptographic standards (skipping, though, this gem).
This talk follows the research program laid out by Colbeck in 2006, where he suggested that one might take the output of some CHSH games, verify that the winning probability is higher than the optimal classical value, and then apply a deterministic extractor. Difficulties here involve quantum side information, information locking, and the other nuisances that composable security was invented to address. Vazirani-Vidick showed in 2011 that this was possible if in the honest case the Bell inequality is violated optimally; at last year's QIP, Miller and Shi showed it worked with some not-quite-perfect value of the CHSH game; and the current result extends this to any beyond-classical value of the game.
One of the key technical ingredients is a new uncertainty principle.
We will not fully do it justice here. Roughly it is as follows. Define
Y = \text{tr} [\rho_+^{1+\epsilon} + \rho_-^{1+\epsilon}] / \text{tr}[\rho^{1+\epsilon}] and X = \text{tr} [\rho_0^{1+\epsilon}] / \text{tr} [\rho^{1+\epsilon}], where \{\rho_0,\rho_1\} and \{\rho_+,\rho_-\} are the post-measurement states resulting from a pair of anticommuting measurements on \rho. Given this, the Miller-Shi uncertainty principle states that the pair (X,Y) must fit into a particular region of the plane. (See eq. (2.2) of their paper for a bit more detail.)
The proof uses the “uniform convexity of the S_{1+\epsilon} norm” due to Ball, Carlen, & Lieb (see also this note). Carl suggests there are more applications. Fernando Brandao and I [Aram] have one! Stay tuned.
At one point Carl says: “By an inductive argument (which actually takes 20 pages), we see that…”
The result can then be generalized beyond X-Z measurements to any pair of non-commuting measurements.
Now what about totally untrusted devices? These too can be forced, using a VV-like approach, to act in a way as though they are performing non-commutative measurements.
This can be generalized even further to Kochen-Specker inequalties and contextuality games, modulo some mild assumptions on how the device works.
Open problems: what resources are used? This protocol uses a linear amount of entanglement, for example. For any experimentalists watching (for whom entanglement is scarce and randomness abundant), this part of the talk must sound funny.

Neil J. Ross and Peter Selinger.
Optimal ancilla-free Clifford+T approximation of z-rotations (Plenary Talk)
abstract arXiv:1403.2975

Solovay-Kitaev was all about geometry, but recently the field of quantum compiling has shifted to algebraic number theory, following the path-breaking 2012 paper by Kliuchnikov, Maslov, and Mosca.
Some number theory: In the ring \mathbb{Z}[\sqrt{2}] define (a+b\sqrt{2})^{\bullet} = a-b\sqrt{2} and the norm N(a) := \sqrt{a^\bullet a}.
Applying the bullet operation (a Galois automorphism of the number field) to a compact interval yields a discrete unbounded set. The 1-d grid problem is, given finite intervals A, B \subset \mathbb{R}, find an element of A\cap B^{\bullet}. Generically there will be O(|A|\cdot |B|) solutions, and these are easy to find when both |A| and |B| are large, corresponding to a fat rectangle (when \mathbb{Z}[\sqrt{2}] is viewed as a 2-d lattice). When we have a long thin rectangle we can convert it into this case by rescaling.
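In the fat-rectangle regime the 1-d grid problem really is just enumeration. A sketch (the interval endpoints are arbitrary choices of mine):

```python
import math

SQRT2 = math.sqrt(2)

def one_d_grid(A, B):
    """All a + b*sqrt(2) in Z[sqrt(2)] lying in A whose bullet lies in B."""
    (a0, a1), (b0, b1) = A, B
    # x in A and x_bullet in B force x - x_bullet = 2*b*sqrt(2) into a
    # finite range, so only finitely many b (and then a) can occur.
    bmin = math.ceil((a0 - b1) / (2 * SQRT2))
    bmax = math.floor((a1 - b0) / (2 * SQRT2))
    sols = []
    for b in range(bmin, bmax + 1):
        lo = max(a0 - b * SQRT2, b0 + b * SQRT2)
        hi = min(a1 - b * SQRT2, b1 + b * SQRT2)
        sols.extend((a, b) for a in range(math.ceil(lo), math.floor(hi) + 1))
    return sols

A, B = (0.0, 4.0), (-4.0, 4.0)
sols = one_d_grid(A, B)
assert sols, "a fat rectangle should contain solutions"
for a, b in sols:
    assert A[0] <= a + b * SQRT2 <= A[1]
    assert B[0] <= a - b * SQRT2 <= B[1]
```

The number of solutions scales like |A|\cdot|B| divided by the covolume 2\sqrt{2} of the embedded lattice, matching the O(|A|\cdot|B|) count above.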
The 2-d grid problem: Now we consider \mathbb{Z}[\omega] for \omega=\exp(2\pi i /8). This turns into a more complicated set intersection problem on a 2-d lattice whose details we omit. But just like the rescaling used for 1-d, now we use a more complicated set of shearing operators to transform the problem into one where the solutions are easier to find. But we need an extra property: the “uprightness” of the rectangles.
Main Theorem: Let A and B be two convex sets with non-empty interior. There is a grid operator G such that G(A) and G^\bullet(B) are both 1/6 upright. Moreover, it can be computed efficiently.
Theorem [Matsumoto-Amano ’08]
Every Clifford + T single-qubit operator W can be uniquely written in the form
W = (T \mid \varepsilon)(HT \mid SHT)^* C,
where C is a Clifford and \varepsilon denotes the empty word.
Exact synthesis of the Clifford+T operators was solved by Kliuchnikov, Maslov, and Mosca: a 2-by-2 unitary operator W is Clifford+T iff its matrix elements all lie in \frac{1}{2^{k/2}} \mathbb{Z}[\omega] for some integer k, where \omega = \sqrt{i}. Moreover, if \det W = 1, then the T-count of the resulting operator is equal to 2k-2.
The upshot of this is that if you have a factoring oracle then the algorithm gives circuits of optimal T-count. In the absence of such an oracle, then this returns a nearly optimal T-count, namely the second-to-optimal T-count m plus a term of order O(\log\log 1/\epsilon).


Adam Bouland and Scott Aaronson.
Generation of Universal Linear Optics by Any Beamsplitter
abstract arXiv:1310.6718

Are there any interesting sets of beamsplitter-type quantum gates that don’t generate either only permutation matrices or arbitrary unitaries? The main result is that if one has a two-level unitary of determinant 1 and with all entries non-zero, then it densely generates SU(m) or SO(m) for m \ge 3. In other words, any beamsplitter that mixes modes is universal for three or more modes. Another way to say this is that for any beamsplitter b, either b is efficiently classically simulable or else it is universal for quantum optics. This is a nice dichotomy theorem. The proof requires the classification of the finite subgroups of SU(3), which was, surprisingly, completed only in 2013. (It was written down in 1917 as “an exercise for the reader” but educational standards have apparently come down since then.)


Isaac Kim.
On the informational completeness of local observables
abstract arXiv:1405.0137

2014 has seen some of the most exciting news for quantum conditional mutual information since strong subadditivity.
The goal of this talk is to try to find a large class of interesting states S such that quantum state tomography and quantum state verification can be done efficiently. We would like to also have that if a state is in S then we can efficiently verify that fact.
At one point in the intro, Isaac says that a point is “…for the experts, err, for the physicists.” Then he remembers who his audience is. Later he says “for this audience, I don’t have to apologize for the term ‘quantum conditional mutual information'”.
This talk proposes a class S_n that has this verification property in time O(n) where n is the number of particles. Any state in S_n is defined by a set of O(1)-particle local reduced density operators.
Isaac first reviews the approach to matrix product state tomography, which uses the parent Hamiltonian of a matrix product state to show that any (injective) MPS is completely determined by a local set of observables. This result is a generalization of the MPS tomography result to higher dimensions, but with the important difference that there is no need for a global wavefunction at all!
The main result, roughly speaking, is that there exists a certificate \epsilon(\{\rho_k \mathcal{N}_k\}) such that
|\rho - \rho'|_1 \le \epsilon(\{\rho_k \mathcal{N}_k\}).
Local density matrices can (approximately) reconstruct the global state. Previously this was done assuming that the global state is a low-bond-dimension MPS. But that’s in a sense circular because we have to make a (strong) assumption about the global state. This work allows us to do this purely using local observables. Awesome.
The talk proceeds by giving a reconstruction procedure by which a global state \rho' can be reconstructed from the marginals of a state \rho, and by showing conditions under which \rho\approx \rho'. The global state is completely determined by the local reduced density operators if the conditional quantum mutual information is zero.

Theorem 1: If \rho_{AB} = \sigma_{AB} and \rho_{BC} = \sigma_{BC} then
\frac{1}{4} \|\rho_{ABC} - \sigma_{ABC}\|_1^2 \leq I(A:C|B)_\rho + I(A:C|B)_\sigma

This is nice, but the upper bound depends on global properties of the state. (Here we think of AB and BC as “local” and ABC as “global.” Don’t worry, this will scale well to n qubits.)
To proceed we would like to upper bound the QCMI (quantum conditional mutual information) in terms of locally observable quantities. This is a straightforward but clever consequence of strong subadditivity. Specifically
I(A:C|B) \leq S(CF) - S(F) + S(BC) - S(B)
for any system F.
For judicious choice of regions, and assuming a strong form of the area law conjecture (S(A) = a |\partial A| - \gamma + o(1) in 2-d), this bound is effective.
We can then build up our reconstruction qubit-by-qubit until we have covered the entire system.
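Since everything in these bounds is an entropy of a reduced state, it is easy to experiment with numerically. Here is a small sketch (my own, with ad hoc helper names) that computes I(A:C|B) for a random three-qubit state and confirms that strong subadditivity makes it nonnegative:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density_matrix(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def partial_trace(rho, keep, dims):
    """Reduced state on the subsystems in `keep` (dims = local dimensions)."""
    n = len(dims)
    rho = rho.reshape(dims + dims)
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + rho.ndim // 2)
    d = int(round(rho.size ** 0.5))
    return rho.reshape(d, d)

def entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

dims = [2, 2, 2]                           # subsystems A, B, C
rho = random_density_matrix(8)

def S(keep):
    return entropy(partial_trace(rho, keep, dims))

# I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC), nonnegative by SSA.
qcmi = S([0, 1]) + S([1, 2]) - S([1]) - entropy(rho)
assert qcmi >= -1e-9
```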
Applications:

  • quantum state tomography: Measure local observables and check the locally computable upper bound. If the latter is small, then the former contains sufficient information to reconstruct the global wavefunction.
  • quantum state verification: Similar except we want to just check whether our physical state is close to a desired one

Bartek Czech, Patrick Hayden, Nima Lashkari and Brian Swingle.
The information theoretic interpretation of the length of a curve
abstract arXiv:1410.1540

At ultra-high energies, we need a theory of quantum gravity to accurately describe physics. As we move down in energy scales, vanilla quantum field theory and eventually condensed-matter physics become relevant. Quantum information theory comes into play in all three of these arenas!
One of the principal claims of this talk is that quantum information is going to be central to understanding holography, which is the key concept underlying, e.g., the AdS/CFT correspondence. Quantum gravity theories in negatively curved spacetimes (saddle-like geometries) seem to have a duality: one Hilbert space has two theories. The duality dictionary helps us compute quantities of interest in one theory and map the solutions back to the other theory. It turns out that these dictionaries often involve computing entropic quantities.
A funny feature of AdS space is that light can reach infinity and bounce back in finite time! This acts just like boundary conditions (at infinity) and simplifies certain things. Inside the bulk of AdS space, geodesics are (in the Poincare disk model) just circular arcs that meet the boundary at right angles. When there is matter in the bulk, these get deformed. We’d love to be able to interpret these geometric quantities in the bulk back in the CFT on the boundary.
The Ryu-Takayanagi correspondence links the length of a geodesic in the bulk with the entanglement entropy in the CFT on the boundary. This can be related (using e.g. work by Calabrese & Cardy) to the central charge of the CFT. A really neat application of this correspondence is a very simple and completely geometric proof of strong subadditivity due to Headrick and Takayanagi (with the important caveat that it applies only to a restricted class of states, namely those states of CFTs that have gravity duals).
The new result here is to generalize the Ryu-Takayanagi correspondence to general curves, not just geodesics. The main result: there is a correspondence between the length of convex curves in the bulk and the optimal entanglement cost of streaming teleportation communication tasks that Alice and Bob perform on the boundary.

QIP 2015 "live"-blogging, Day 3

We promise we’ll finish posting these soon! Day 3 was only a half day of talks with a free afternoon, and the rainy weather of the first two days finally subsided just in time.

Jean-Pierre Tillich
Decoding Quantum LDPC Codes
abstract

LDPC codes are families of sparse codes, meaning that the stabilizers are all low weight, and we usually also require that each qubit partakes in at most a constant number of stabilizers. That is, the parity check matrix is both row and column sparse with constant sparsity per row and column.
The classical versions of these codes are ubiquitous; your cell phone uses LDPC codes, for example. One of the principal advantages of (classical) LDPC codes is that they have good minimum distance, constant rate, and fast nearly-optimal decoders. A new variant called spatially coupled LDPC codes is a universal way (i.e., independent of the channel) to achieve capacity for most memoryless channels of interest with a low-complexity decoder.
Unfortunately, the quantum case is still largely open, and the speaker admitted at the outset that he wasn’t going to solve everything during his talk. Leaving aside the question of the distance of quantum LDPC codes, one of the reasons why decoding a quantum code is hard is that any error E has the same syndrome as ES whenever S is in the stabilizer group. This degeneracy creates an extra freedom that is difficult to account for in a decoder: you want to find the most likely coset, rather than the most likely error.
In a classical LDPC code, the Tanner graph (the bipartite graph whose nodes are the bits and the checks, with an edge between a bit and each check that acts on it) is nearly tree-like (it’s an expander), and so message-passing algorithms perform nearly at capacity. In the quantum case, the Tanner graph contains many 4-cycles, and message-passing decoders get stuck in loops and don’t converge.
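To make the 4-cycle obstruction concrete, here is a small sketch (my own illustration, not from the talk) that counts 4-cycles directly from a parity-check matrix: two checks that share two or more bits form a 4-cycle in the Tanner graph. A classical ring code has none, while a toy CSS code, whose X and Z checks must overlap on an even number of qubits to commute, unavoidably has them.

```python
import itertools
import numpy as np

def four_cycles(H):
    """Count 4-cycles in the Tanner graph of parity-check matrix H:
    each pair of checks (rows) sharing k >= 2 bits contributes C(k,2) cycles."""
    count = 0
    for r1, r2 in itertools.combinations(range(H.shape[0]), 2):
        k = int(H[r1] @ H[r2])        # number of bits both checks act on
        count += k * (k - 1) // 2
    return count

# Classical ring (cycle) code on 4 bits: adjacent checks share only one bit.
H_ring = np.array([[1, 1, 0, 0],
                   [0, 1, 1, 0],
                   [0, 0, 1, 1],
                   [1, 0, 0, 1]])

# Toy CSS code: stacking X- and Z-type checks, which must overlap on an even
# number of qubits, forces 4-cycles in the joint Tanner graph.
H_css = np.array([[1, 1, 1, 1],    # X-type check
                  [1, 1, 0, 0],    # Z-type check
                  [0, 0, 1, 1]])   # Z-type check

print(four_cycles(H_ring), four_cycles(H_css))  # expect 0 and 2
```

The even-overlap requirement is exactly the commutation constraint between X and Z stabilizers, which is why these short cycles are intrinsic to quantum CSS codes rather than an artifact of a bad code choice.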
Bravyi, Suchara, and Vargo have recently had a breakthrough where they were able to decode a very special quantum LDPC code near the hashing bound: the toric code. But didn’t we already know how to decode the toric code? Yes, of course, there are many good decoders for this code. However, none of them simultaneously decodes X and Z errors with performance approximately equal to that of the maximum a posteriori (MAP) decoder.
There is also a new idea, pioneered for polar codes by Renes, Dupuis, and Renner and expanded by Delfosse and Tillich, to use asymmetric codes and decoders. If you are going to decode X first and then Z, why not use a stronger code for X errors, decode them first, and then use that advantage to help you decode the Z errors afterward? Using this idea, you can provably achieve the hashing bound asymptotically.
He concludes with some bright spots: recent global analyses of local decoders (such as the recent work by Hastings on hyperbolic codes), and tricks like entanglement-assisted codes or spatially coupled quantum LDPC codes, might lead to capacity-achieving decoders soon.


Fernando Pastawski and Beni Yoshida.
Fault-tolerant logical gates in quantum error-correcting codes
abstract arXiv:1408.1720

A great intro: “My paper is at 1408.1720 + PRA…I forgot the full journal reference. You can look it up, which is what people usually do.” Or they just get it from the arxiv!
Fault-tolerant logical gates are central objects in quantum computation. We need them to suppress the spread of errors in computations. Unfortunately, there is a somewhat stringent no-go theorem due to Eastin and Knill that says you cannot have a universal set of transversal gates. (There are various ways around this theorem, such as recent work by Paetznick and Reichardt.)
Recently, Bravyi and Koenig have generalized this work. They extend the notion of transversal gate to the idea of a constant-depth circuit and show that constant-depth circuits restrict the allowable gates performable in topological codes to certain levels of the Gottesman-Chuang hierarchy. The main idea of the proof is to use a cleaning lemma. If you can split your code into m+1 correctable regions, then you can only implement gates in level m of the Clifford hierarchy. In particular, D-dimensional topological codes can be split into D+1 correctable regions, so we can only implement gates at level D in the hierarchy.
What Yoshida and Pastawski have done is show that if you have a code with a loss tolerance threshold p_{\mathrm{loss}} > 1/n for some number n, then any transversal gate must be in P_{n-1}. Equivalently, if you have a logic gate in P_n (but not in P_{n-1}), then p_{\mathrm{loss}} \le 1/n.
Another result is that if a stabilizer Hamiltonian in 3 dimensions has a fault-tolerantly implementable non-Clifford gate, then the energy barrier is constant. The proof is simple: we can split the 3-d chunk of code into correctable regions that have string-like shapes, so there must exist a string-like logical operator for these codes, hence the energy barrier is constant. Beautifully simple!
If you have a topological stabilizer code in D dimensions with an mth-level logical gate, then the code distance is at most d\le O(L^{D+1-m}). This improves on a bound of Bravyi-Terhal that didn’t take advantage of this m factor.
Lastly, we can consider subsystem codes. In D dimensions, fault-tolerant gates are in P_D (as long as the code distance grows at least logarithmically in the system size).


For the remaining session, it turns out that no Pontiffs were in the audience. However, we had mostly seen these results in other venues or read them before. They are great results, so go read the papers! We hope the trackback to the arxiv is a small compensation to the authors for our failure to blog these interesting talks.

Marco Piani and John Watrous.
Einstein-Podolsky-Rosen steering provides the advantage in entanglement-assisted subchannel discrimination with one-way measurements
abstract arXiv:1406.0530


Stefan Baeuml, Matthias Christandl, Karol Horodecki and Andreas Winter.
Limitations on Quantum Key Repeaters
abstract arXiv:1402.5927


William Matthews and Debbie Leung.
On the power of PPT-preserving and non-signalling codes
abstract arXiv:1406.7142

QIP 2015 live-blogging, Day 2

From the team that brought you “QIP 2015 Day 1 liveblogging“, here is the exciting sequel. Will they build a quantum computer? Will any complexity classes collapse? Will any results depend on the validity of the Extended Riemann Hypothesis? Read on and find out!
Praise from a reader of “day 1”:
QIP 2015 liveblogging — it’s almost like being there. Maybe better.

David J. Wineland
Quantum state manipulation of trapped ions
abstract

Rather than “bore us” (his words) with experimental details, Dave gave a broad-brush picture of some of the progress that his lab has made over the years at improving the coherence of quantum systems.
Dave gave a history of NIST looking for more accurate clocks. Recently, a near-UV transition of trapped Hg ions at last did better than the continually improving microwave Cs standard.
At a 1994 conference at NIST, they invited Artur Ekert to speak about quantum gates. Cirac and Zoller gave the first detailed proposal for quantum computing with a linear ion trap at about this time. They were quickly able to demonstrate one of these gates in a linear ion trap.
He showed and discussed a picture of the racetrack planar ion-trap array, where ions are moved into position to perform gates throughout the trap. They can move and manipulate the ions using a scheme due to Milburn, Schneider, James, Sorensen, and Molmer that uses position-dependent dipole forces. The transverse Ising model can be simulated by applying a moving standing wave to ions in a linear trap; this is a test case for useful simulations.
Other groups at NIST have also done impressive work on quantum simulation. Bollinger’s group has made a self-assembled triangular lattice with Ising-type couplings that we talked about previously here on the Pontiff.
Everyone in the ion trap business is plagued by something called “anomalous heating”, of unknown origin, which gets worse as the length scale gets smaller. Colleagues studying surface science have suggested using an argon ion cannon (damn, that sounds impressive) to blast away impurities in the surface trap electrodes, scrubbing the surface clean. This has reduced anomalous heating 100 fold, but it’s still above all known electronic causes. Using cryogenic cooling helps too, as has been done by Ike Chuang’s group at MIT.
Laser intensity fluctuations at the site of the ions are another continual source of error. Optical and IR beams can be efficiently transmitted and positioned by optical fibers, but UV beams create color centers and degrade optical fiber on a timescale of about an hour. Recent work by the group has shown that this degradation timescale can be extended somewhat.
Dave showed a list, and there are about 30+ groups around the world working on ion-trap quantum information processing. Pretty impressive!
Dave showed a Time magazine cover that calls D-Wave the “Infinity Machine” that no one understands. In contrast, he says, we know how quantum computing works… and how it doesn’t. Sober experimentalists seem to be in rough agreement that

  • A factoring machine is decades away.
  • Quantum simulation may be possible within the next decade.
  • The real excitement will be a simulation that tells us something new about physics.

Joel Wallman and Steve Flammia
Randomized Benchmarking with Confidence
abstract arXiv:1404.6025

Randomized benchmarking is a standard method whereby experimental implementations of quantum gates can be assessed for their average-case accuracy in a way that doesn’t conflate the noise on the gates with the noise of state preparation and measurement (SPAM) errors.
The protocol is simple:

  • Choose a random sequence of m Clifford gates
  • Prepare the initial state in the computational basis
  • Apply the Clifford gate sequence and then the inverse of the whole sequence at the end
  • Measure in the computational basis.

Repeat this for many random sequences and many repetitions of each sequence to get statistics. Under a certain noise model called the “0th order model”, the averages of this procedure for different values of m will fit a model of the form F_m = A + B f^m, where f is a quantity closely related to the average quality of the gates in the sequence. Define r to be the average error rate. (Morally, this is equivalent to “1-f” in the above model, but the actual formula is more complicated.) To understand the convergence of this protocol to an estimate, we need to understand the variance as a function of m and r.
The main contribution is to reduce the variance bound from the trivial bound of O(1) to O(mr). This provides a good guide on how to choose optimal sequence lengths m for experiments, and the bounds are nearly exact in the case of a single qubit. In the parameter range of interest, this improves on previous estimates of the sample complexity by three orders of magnitude.
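To see how the fitting works in practice, here is a minimal numerical sketch (my own, not the authors’ code): generate synthetic average fidelities from the zeroth-order model F_m = A + Bf^m, then recover f by scanning over it and solving for A and B by linear least squares; for a single qubit the average error rate is r = (1-f)/2.

```python
import numpy as np

def fit_rb(ms, data, f_grid=np.linspace(0.9, 0.999, 1000)):
    """Fit F_m = A + B*f**m: scan over f, solve (A, B) by linear least squares."""
    best, best_resid = None, np.inf
    for f in f_grid:
        X = np.column_stack([np.ones(len(ms)), f ** ms])
        coef, *_ = np.linalg.lstsq(X, data, rcond=None)
        resid = np.sum((X @ coef - data) ** 2)
        if resid < best_resid:
            best, best_resid = (coef[0], coef[1], f), resid
    return best  # (A, B, f)

rng = np.random.default_rng(0)
ms = np.arange(1, 200, 10)                          # sequence lengths m
A_true, B_true, f_true = 0.5, 0.5, 0.98
data = A_true + B_true * f_true ** ms
data = data + rng.normal(0, 0.005, size=ms.shape)   # finite-sampling noise

A_fit, B_fit, f_fit = fit_rb(ms, data)
r_fit = (1 - f_fit) / 2                             # single-qubit average error rate
```

Choosing which values of m to sample, and how many repetitions each needs, is exactly where the O(mr) variance bound comes in.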


Fernando Brandao, Marcus Cramer and Madalin Guta
A Berry-Esseen Theorem for Quantum Lattice Systems and the Equivalence of Statistical Mechanical Ensembles
abstract

The full version is not yet on the arxiv, but due to an author mistake, the above link gives the long version of the QIP submission. Download it there while you still can!
Quantum many-body systems are pretty wild objects, with states in 2^{10^{23}} dimensions or even worse. But we often have a mental model of them as basically non-interacting spins. In some cases, the renormalization group and other arguments can partially justify this. One thing that’s true in the case of non-interacting spins is that the density of states is approximately Gaussian. The idea here is to show that this still holds when we replace “non-interacting spins” with something morally similar, such as exponentially decaying correlations, bounded-range interactions, etc.
This way of writing it makes it sound trivial. But major open questions like the area law fit into this framework, and proving most statements is difficult. So technical advances in validating our “finite correlation length looks like non-interacting spins” intuition can be valuable.
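For literally non-interacting spins, the Gaussian density of states is just the de Moivre-Laplace theorem. A quick numerical sanity check (my own illustration): taking the energy to be the number of up spins, the exact binomial density of states is already pointwise very close to the matching Gaussian at N = 100.

```python
import math

N = 100                       # number of non-interacting spins
mean, var = N / 2, N / 4      # energy = number of up spins: Binomial(N, 1/2)

max_diff = 0.0
for k in range(N + 1):
    binom = math.comb(N, k) / 2**N                       # exact density of states
    gauss = math.exp(-(k - mean)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    max_diff = max(max_diff, abs(binom - gauss))

print(max_diff)   # comfortably below 1e-3 for N = 100
```

The hard part of the paper is showing that a comparable statement survives when the spins interact but correlations decay fast enough.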
Today’s technical advance is a quantum version of the Berry-Esseen theorem. The usual Berry-Esseen theorem gives quantitative bounds on the convergence to the mean that we get from the central limit theorem. Here we consider a lattice version, where we consider spins on a d-dimensional lattice and local observables A and B that act on subsets of spins separated by a distance L. We require a finite correlation length, as we get, for example, for all Gibbs states above some critical temperature (or at any nonzero temperature in D=1).
What does a (quantitative) CLT give us beyond mere large deviation bounds? It shows that the density of states (at least those inhabited by the particular state \rho) is roughly Gaussian, thereby roughly matching what we would get from a tensor power state. This is somewhat stronger than the “typical subspace”-type guarantees that we would get from large deviation bounds.
The main application here is an equivalence theorem between the canonical and microcanonical ensembles: i.e. between the Gibbs state and a uniform mixture over an energy band of width O(\sqrt N). These states are far apart in trace distance, but this paper shows that they look similar with respect to sufficiently local observables. If you think this sounds easy, well, then try to prove it yourself, and then once you give up, read this paper.


Michael Kastoryano and Fernando Brandao
Quantum Gibbs Samplers: the commuting case
abstract arXiv:1409.3435

How efficiently can we prepare thermal states on a quantum computer? There is a related question: how does nature prepare states? That is, what is the natural rate for thermalization given a quantum lattice system? There are two possible ways to model thermalization, both of which are computationally efficient. “Davies generators” mean local jumps that can be modeled as local interactions with a Markovian bath at a fixed temperature, while “heat-bath generators” mean that we repeatedly apply the Petz recovery map to small blocks of spins. Call both “Gibbs samplers.”
Consider the setting where you have a system living on a lattice with a bit or qubit on each site, and some memoryless, spatially local, dynamics. Classically the powerful tools of DLR (Dobrushin-Lanford-Ruelle) theory imply a close relation between properties of the dynamics and properties of the stationary state. Specifically, spatial mixing (meaning decaying correlations in the stationary state) can be related to temporal mixing (meaning that the dynamics converge rapidly to the stationary state). (The best reference I know is Martinelli, but for a more CS-friendly version, see also this paper.)
An exact quantum analogue of this cannot reasonably be defined, since the classical definition involves conditioning – which is often the reason that classical information theory ideas fail to translate into the quantum case.
One of the first contributions of this work then is to define quantum notions of “weak clustering” (more or less the familiar exponential decay of correlations between well-separated observables) and “strong clustering” (a more complicated definition involving overlapping regions). Then the main result is that there is an intimate connection between the rate of convergence of any quantum algorithm for reaching the Gibbs state and the correlations in the Gibbs state itself. Namely: strong clustering (but not weak clustering) is equivalent to rapid mixing of the Gibbs sampler. Everything here assumes commuting Hamiltonians, by the way. Also, “rapid mixing” is equivalent to the Gibbs sampler being gapped (think of this like the quantum version of being a gapped Markov chain).
One direction is fairly straightforward. To show that strong clustering implies a gapped Gibbs sampler, we directly apply the variational characterization of the gap. (The dynamics of a continuous-time Gibbs sampler can be written as \dot\rho = -\mathcal{A}[\rho] for some linear superoperator \mathcal{A}, which we will assume to be Hermitian for convenience. \mathcal{A} has all nonnegative eigenvalues because it is stable, and it has a single eigenvalue equal to 0, corresponding to the unique stationary distribution. The gap is given by the smallest positive eigenvalue, and this “smallest” is what gives rise to the variational characterization. See their paper for details.) The variational calculation involves a minimization over (global) states and strong clustering lets us reduce this to calculations involving local states that are much easier to bound.
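Concretely, the variational characterization of the gap is the usual Rayleigh-quotient form; schematically (the paper works with the inner product weighted by the Gibbs state \sigma, so take the formula below as a sketch rather than their exact statement):
\displaystyle \lambda_{\mathrm{gap}} = \min_{X \perp \mathbb{1},\, X \neq 0} \frac{\langle X, \mathcal{A}[X] \rangle_\sigma}{\langle X, X \rangle_\sigma}.
Strong clustering enters by letting the minimization over global observables X be controlled by purely local ones.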
In the other direction (gap implies strong clustering), we relate the Gibbs sampler to a local Hamiltonian, and use the detectability lemma, which in fact was originally used in part to prove a statement about decay of correlations. The idea is to construct an AGSP (approximate ground-state projector) which is a low-degree polynomial of the Hamiltonian. Because it’s low degree, applying it does not increase the entanglement across any cut by much (useful for proving area laws) or does not propagate correlations far (along the lines of Lieb-Robinson; useful for correlation decay).
When can these results be applied? In 1-D, strong and weak clustering are equivalent (because boundary terms can be removed), and therefore both are implied by (Hamiltonian) gap. Also in any number of spatial dimensions, above a universal critical temperature the Gibbs samplers are always gapped.
Some open questions:

  • If, in 2-D, one could also show strong=weak clustering (as is known classically in <3 dimensions), it would put the final nail in the coffin of 2D quantum memory for any commuting Hamiltonian.
  • Classically, there is a dichotomy result: either mixing is very rapid (log(N) time) or very slow (exp(N) time). Here they can only get poly(N) mixing. Can these results be extended to the log-Sobolev-type bounds that give this kind of result?

Mehmet Burak Şahinoğlu, Dominic Williamson, Nick Bultinck, Michael Marien, Jutho Haegeman, Norbert Schuch and Frank Verstraete
Characterizing Topological Order with Matrix Product Operators
MERGED WITH
Oliver Buerschaper
Matrix Product Operators: Local Equivalence and Topological Order
abstract-137 arXiv:1409.2150 abstract-176

Characterizing topological quantum order is a challenging problem in many-body physics. In two dimensions, it is generally accepted that all topologically ordered ground states are described (in a long-range limit) by a theory of anyons. These anyonic theories have characteristic features like topology-dependent degeneracy and local indistinguishability in the ground space and string-like operators that map between these ground states.
The most famous example of this is Kitaev’s toric code, and we are interested in it at a quantum information conference because of its ability to act as a natural quantum error-correcting code. The four ground states of the toric code can be considered as a loop gas, where each ground state is a uniform superposition of all loops on the torus satisfying a given parity constraint.
The goal in this talk is to classify types of topological order using the formalism of matrix product states, and their slightly more general cousins, matrix product operators (MPO). The authors define an algebra for MPOs that mimics the algebra of loop operators in a topologically ordered material. Because matrix product operators have efficient descriptions classically, they are well suited to numerical studies, and their structure also allows them to be used for analytical investigations.
The main idea that the authors introduce is a condition on MPOs so that they behave like topological operators. In particular, they obey a “deformation” condition that lets them be pushed around the lattice, just like Wilson loops.
The authors used this idea to study models that are not stabilizer codes, such as the double semion model and more generally the class of string-net models. This looks like a very promising tool for studying topological order.


Dorit Aharonov, Aram Harrow, Zeph Landau, Daniel Nagaj, Mario Szegedy and Umesh Vazirani
Local tests of global entanglement and a counterexample to the generalized area law
abstract 1410.0951

Steve: “Counterexamples to the generalized area law” is an implicit admission that they just disproved something that nobody was conjecturing in the first place. 😉
Aram: I’ll blog about this later.


Xiaotong Ni, Oliver Buerschaper and Maarten Van Den Nest
A non-commuting Stabilizer Formalism
abstract arXiv:1404.5327

This paper introduces a new formalism called the “XS stabilizer” formalism that allows you to describe states in an analogous way to the standard stabilizer formalism, but where the matrices in the group don’t commute. The collection of matrices is generated by X, S, \alpha, where \alpha = \sqrt{i} and S = \sqrt{Z} on n qubits. A state or subspace that is stabilized by a subgroup of these operators is said to be an XS stabilizer state or code. Although these are, as Xiaotong says, “innocent-looking tensor product operators”, the stabilizer states and codes can be very highly entangled.
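The non-commutativity is easy to check directly. As a small numerical sketch (the specific relation below is my own illustration, not taken from the paper), with S = diag(1, i) and \alpha = e^{i\pi/4} one finds S X = \alpha^2 X S Z:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
S = np.diag([1, 1j])                  # S = sqrt(Z)
alpha = np.exp(1j * np.pi / 4)        # alpha = sqrt(i)

# S squares to Z, and alpha**2 equals i:
assert np.allclose(S @ S, Z)
assert np.isclose(alpha**2, 1j)
# Unlike Paulis, S and X do not commute up to a sign; instead:
assert np.allclose(S @ X, alpha**2 * (X @ S @ Z))
print("XS commutation relations verified")
```

So the generators only close under multiplication once you allow extra phases and Z factors, which is exactly the non-commutative structure the XS formalism is built to handle.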
One of the applications of this formalism is to classify the double semion model, which is a local Hamiltonian model with topological order. There are sets of general conditions for when such states and codes can be ground states of local commuting XS Hamiltonians. Unfortunately, not all of these properties can be computed efficiently; some of them are NP-complete to compute. There are some interesting open questions here, for example: for which classes of commuting projector Hamiltonians are the ground states in NP?


Dave Touchette
Direct Sum Theorem for Bounded Round Quantum Communication Complexity and a New, Fully Quantum Notion of Information Complexity (Recipient of the QIP2015 Best Student Paper Prize)
abstract arXiv:1409.4391

“Information complexity” is a variant of communication complexity that measures not the number of bits exchanged in a protocol but the amount of “information”, however that is defined. Here is a series of tutorials for the classical case. Entropy is one possibility, since this would put an upper bound on the asymptotic compressibility of many parallel repetitions of a protocol. But in general this gives up too much. If Alice holds random variables AC, Bob holds random variable B and Alice wants to send C to Bob then the cost of this is (asymptotically) I(A:C|B).
This claim has a number of qualifications. It is asymptotic and approximate, meaning that it holds in the limit of many copies. However, see 1410.3031 for a one-shot version. And when communication is measured in qubits, the amount is actually \frac{1}{2} I(A:C|B).
Defining this correctly for multiple messages is tricky. In the classical case, there is a well-defined “transcript” (call it T) of all the messages, and we can define information cost as I(X:T|Y) + I(Y:T|X), where X,Y are the inputs for Alice and Bob respectively. In the quantum case we realize that the very idea of a transcript implicitly uses the principle that (classical) information can be freely copied, and so for quantum protocols we cannot use it. Instead Dave just sums the QCMI (quantum conditional mutual information) of each step of the protocol. This means I(A:M|B) when Alice sends M to Bob and I(B:M|A) when Bob sends M to Alice. Here A,B refer to the entire systems of Alice/Bob respectively. (Earlier work by Yao and Cleve-Buhrman approached this in other, less ideal, ways.)
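Schematically, then (suppressing the exact register bookkeeping at each round, as well as the factor of 1/2 for qubit messages mentioned above), the quantity being summed is
\displaystyle \mathrm{QIC}(\Pi) = \sum_{t\ \mathrm{odd}} I(A : M_t | B) + \sum_{t\ \mathrm{even}} I(B : M_t | A),
where M_t is the message sent in round t and A, B are the full registers held by Alice and Bob at that point in the protocol \Pi.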
When minimized over all valid protocols, Dave’s version of Quantum Information Complexity represents exactly the amortized quantum communication complexity. This sounds awesome, but there are a bunch of asterisks. First “minimized over all valid protocols,” is an unbounded minimization (and some of these protocols really do use an infinite number of rounds), although it is in a sense “single-shot” in that it’s considering only protocols for calculating the function once. Also “amortized” here is not quite the same as in Shannon theory. When we talk about the capacity of a channel or its simulation cost (as in the sense of reverse Shannon theorems) we usually demand that the block error rate approach zero. In this case, the information complexity is defined in terms of an error parameter \epsilon (i.e. it is the minimum sum of QCMI’s over all protocols that compute the function up to error \epsilon). This then corresponds to the asymptotic cost of simulating a large number of evaluations of the function, each of which is allowed to err with probability \epsilon. The analogue in Shannon theory is something called rate-distortion theory.
Before you turn up your nose, though, the current talk gets rid of this amortized restriction. QIC (quantum information complexity) is easily seen to be a lower bound for the communication complexity and this work shows that it is also an upper bound. At least up to a multiplicative factor of 1/\epsilon^2 and an additive term that also scales with the number of rounds. Since QIC is also a lower bound for the above amortized version of complexity, this proves a direct sum theorem, meaning that computing n function values costs \Omega(n) times as much as one function evaluation. Here the weak amortized definition actually makes the result stronger, since we are proving lower bounds on the communication cost. In other words, the lower bound also applies to the case of low block-wise error.
The technical tools are the one-shot redistribution protocol mentioned above (see also this version) and the Jain-Radhakrishnan-Sen substate theorem (recently reproved in 1103.6067 and the subject of a press release that I suppose justifies calling this a “celebrated” theorem). I should write a blog post about how much I hate it when people refer to “celebrated” theorems. Personally I celebrate things like Thanksgiving and New Year’s, not the PCP theorem. But I digress.


Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Perez-Garcia and Sergii Strelchuk
Unbounded number of channel uses are required to see quantum capacity
abstract arXiv:1408.5115

Is the quantum capacity of a quantum channel our field’s version of string theory? Along the lines of this great Peter Shor book review, quantum Shannon theory has yielded some delightful pleasant surprises, but our attempts to prove an analogue of Shannon’s famous formula C=\max_p I(A:B) have turned into a quagmire that has now lasted longer than the Vietnam War.
Today’s talk is the latest grim news on this front. Yes, we have a capacity theorem for the (unassisted) quantum capacity, the famous LSD theorem, but it requires “regularization”, meaning maximizing a rescaled entropic quantity over an unbounded number of channel uses. Of course the definition of the capacity itself involves a maximization over an unbounded number of channel uses, so formally speaking we are no better off, although in practice the capacity formula can often give decent lower bounds. On the other hand, we still don’t know if it is even decidable.
Specifically the capacity formula is
\displaystyle Q = \lim_{n\rightarrow\infty} Q^{(n)} := \lim_{n\rightarrow\infty} \frac{1}{n} \max_\rho I_c(\mathcal{N}^{\otimes n}, \rho),
where \rho is maximized over all inputs to n uses of the channel and I_c is the coherent information (see paper for def). In evaluating this formula, how large do we have to take n? E.g., could we prove that we always have Q^{(n)} \geq (1-1/n)Q? If this, or some formula like it, were true then we would get an explicit upper bound on the complexity of estimating the capacity.
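For intuition about the quantity being regularized, the qubit erasure channel is one case where everything is computable in closed form: the coherent information is additive and equals (1-2p)S(\rho), which is why the capacity vanishes exactly when p \ge 1/2, consistent with no-cloning. A small numerical sketch (my own, not from the paper):

```python
import numpy as np

def entropy(eigs):
    """von Neumann entropy (base 2) from a list of eigenvalues."""
    eigs = np.array([e for e in eigs if e > 1e-12])
    return float(-np.sum(eigs * np.log2(eigs)))

def coherent_info_erasure(p, rho):
    """Coherent information I_c(N, rho) for the qubit erasure channel.
    Output spectrum: (1-p) * spec(rho) plus an erasure flag of weight p;
    the complementary channel is erasure with probability 1-p."""
    spec = np.linalg.eigvalsh(rho)
    s_out = entropy(list((1 - p) * spec) + [p])
    s_env = entropy(list(p * spec) + [1 - p])
    return s_out - s_env

rho = np.eye(2) / 2   # maximally mixed input maximizes I_c here
# I_c = (1 - 2p) * S(rho): positive below p = 1/2, zero at p = 1/2
print(coherent_info_erasure(0.25, rho))   # ≈ 0.5
print(coherent_info_erasure(0.5, rho))    # ≈ 0.0
```

The channels constructed in this talk are designed precisely so that no such single-letter (or any bounded-n) computation reveals whether the capacity is positive.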
The main result here is to give us bad news on this front, in fairly strong terms. For any n they define a channel for which Q^{(n)}=0 but Q>0.
Thus we need an unbounded number of channel uses to detect whether the quantum capacity (ie the regularized coherent information) is even zero or nonzero.
The talk reviews other non-additivity examples, including classical, private, zero-error quantum and classical capacities. Are there any good review articles here?
Here’s how the construction works. It builds on the Smith-Yard superactivation result, which combines an erasure channel (whose lack of capacity follows from the no-cloning theorem) and a PPT channel (whose lack of capacity follows from being PPT). The PPT channel is chosen to be able to send private information (we know these exist from a paper by H3O) and by using the structure of these states (further developed in later three-Horodecki-and-an-Oppenheim work), one can show that combining this with an erasure channel can send some quantum information. Specifically the PPT channel produces a “shield” which, if faithfully transmitted to Bob, enables perfect quantum communication.
This new construction is similar but uses a shield with many parts any one of which can be used to extract a valid quantum state. On the other hand, the erasure probability is increased nearly to one, and noise is added as well. Proving this is pretty tough and involves sending many things to zero or infinity at varying rates.
During the question period, a prolonged jocular discussion was triggered by John Smolin, who said the title was inappropriate: the authors had clearly shown, by examining the parameters of the channel, that the quantum capacity was positive, so detecting positivity of the capacity required no channel uses at all.
D. Gottesman suggested a more operational interpretation of the title: given a black box, how many uses of it are needed to decide whether its quantum capacity is positive? If it were, e.g., an erasure channel with erasure probability very close to 1/2, arbitrarily many uses would be needed to decide confidently. It’s not clear how to formalize this model.
By the way, better Shannon theory news is coming in a few days for bosonic channels with the talk by Andrea Mari.

Your Guide to Australian Slang for QIP Sydney

To everyone that’s attending QIP, welcome to Sydney!

Since I’ve already had to clarify a number of the finer points of Australian slang to my fellow attendees, I thought I would solve the general problem and simply post a helpful dictionary that translates some uniquely Australian words and usages into standard American English.

Also, this thing on the right is called an ibis. It’s not venomous.

Coffee

Flat white – Try this at least once while you’re here, preferably prepared by a highly skilled barista at one of the better cafes. It’s similar to a latte or to a cappuccino without the foam, but there are important differences.

Long black – Australian version of the Americano, a bit stronger and with crema. It’s the closest you’ll get to a cup of filtered drip coffee, if that’s your thing.

Short black – If you want a standard espresso, order a short black.

The Beach

Thongs – Sandals, or flip-flops. The highest level of dress code in Australia is “no thongs”.

Togs – Swimwear.

Esky – A cooler; the place where you store your beer to keep it cold while you’re getting pissed at the beach.

Pissed – Drunk; the state that a nontrivial fraction of people are in because it’s legal to drink at the beach.

Sunnies – Sunglasses.

Mozzy – Mosquito. Usually not a problem at the beach because there is almost always a breeze.

The Pub

Schooner – (SKOO-ner) A medium-sized glass of beer.

Jug – A pitcher of beer.

Shout – To buy a beer for someone, or a round of beers for your table.

Skol – To chug a beer. Usage: “Hey Robbo, if you skol that schooner I’ll shout you a jug.”

Hotel – In addition to the standard meaning, a hotel is a particular style of pub. It usually has high occupancy and a limited beer selection (though this is starting to improve as craft beer is finally catching on here).

Sports

Football – see “Footy”.

Footy – Rugby. It comes in several varieties, with League and Union being the two most popular.

Gridiron – American football. Not generally watched much down under.

Cricket – An inscrutable game that takes 5 days to play. I think the only way you could like this game is to have the British invade, conquer your land, and occupy your territory under their colonial yoke for at least a few generations. That seems to be how everyone else got into it.

Rooting – Do not make the mistake of saying that you are “rooting for team X”; in Australia, rooting is slang for having sex.

Miscellaneous

Arvo – Afternoon.

Bickie – A cookie or biscuit.

Brekkie – Breakfast.

Fair dinkum – The closest translation is probably “for real”. It’s used to express the sentiment that you’re not deceiving the listener or exaggerating your claims.

Self-correcting Fractals

A really exciting paper appeared on the arxiv today: A proposal for self-correcting stabilizer quantum memories in 3 dimensions (or slightly less), by Courtney Brell. It gives the strongest evidence yet that self-correcting quantum memories are possible in “physically realistic” three-dimensional lattice models. In particular, Courtney has constructed families of local Hamiltonians in 3D whose terms consist of X- and Z-type stabilizer generators and that show phase-transition behavior akin to the 2D Ising model for both the X- and Z-type error sectors. This result doesn’t achieve a complete theoretical solution to the question of whether self-correcting quantum memories can exist in principle, as I’ll explain below, but it makes impressive progress using a mix of rigorous analysis and physical argument.

First, what do I mean by “physically realistic”? Well, obviously I don’t mean physically realistic (without quotes)—that’s a much greater challenge. Rather, we want to abstractly characterize some features that should be shared by a physically realistic implementation, but with enough leeway that a theorist can get creative. To capture this, Courtney introduces the so-called Caltech Rules for a self-correcting quantum memory.

The phrase “the Caltech Rules” is (I believe) attributable to David Poulin. Quantum memory aficionados have been debating these rules in emails and private discussions for the last few years, but I think this is the first time someone has put them in print. As rules, they aren’t really set in stone. They consist of a list of criteria that are either necessary or seemingly necessary to avoid models that are self-correcting for trivial and unphysical reasons (e.g., scaling the coupling strengths as a function of n). In Courtney’s version of the rules, we require a model with finite-dimensional spins (so no bosonic or fermionic models allowed… this might be objectionable to some people), bounded-strength short-range interactions between the spins, a constant density of spins, a perturbatively stable degenerate ground space for the encoded states, an efficient decoding algorithm, and an exponential memory lifetime against low-temperature thermal noise. One might wish to add even more desiderata, like translation-invariant couplings or a spectral gap (which is closely related to stability), but finding a self-correcting memory subject to these constraints is already a tall order. For more discussion of these points, check out another excellent paper that appeared on the arxiv yesterday: a review article on quantum memories at finite temperature by Ben Brown et al.

To motivate the construction, it helps to remember everyone’s favorite models, the Ising model and the Toric code. When the temperature T is zero, it’s easy to store a classical bit using the 1D Ising model; this is just a repetition code. Similarly, the 2D toric code can store quantum information at T=0. Both of these codes become unstable as memories at T > 0 because of the presence of string-like logical operators. The physical process by which these strings are created costs some energy, but then the strings can stretch and grow without any energy cost, and thermal fluctuations alone will create enough strings in a short time to cause a decoding failure. By contrast, the 2D Ising model can store a classical bit reliably for an exponential amount of time if you encode in the total magnetization and you are below the Curie temperature. The logical operators are now membranes that cost energy to grow. Similarly, the 4D toric code has such a phase transition, and this is because the X- and Z-type errors both act analogously to 2D Ising models with membranous logical operators.
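To see the 2D case concretely, here is a toy Metropolis simulation, my own illustration rather than anything from the papers discussed here, with the lattice size, temperature, and number of sweeps chosen arbitrarily. It stores a classical bit in the total magnetization of the 2D Ising model and checks that, below the Curie temperature, thermal fluctuations leave the encoded sign essentially intact:

```python
import math
import random

def metropolis_sweep(spins, L, T):
    """One Metropolis sweep of the 2D Ising model (J = 1, periodic boundaries)."""
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        # Sum of the four nearest-neighbor spins
        nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2 * spins[i][j] * nn  # energy cost of flipping spin (i, j)
        if dE <= 0 or random.random() < math.exp(-dE / T):
            spins[i][j] *= -1

def magnetization(spins, L):
    return sum(map(sum, spins)) / (L * L)

random.seed(0)
L, T = 16, 1.5                        # well below T_c ~ 2.27: the ordered phase
spins = [[1] * L for _ in range(L)]   # encode the bit "+1" in the magnetization
for _ in range(200):
    metropolis_sweep(spins, L, T)
print(magnetization(spins, L))        # stays close to +1: the bit survives
```

Rerunning the final loop at a temperature above T_c would instead drive the magnetization toward zero and destroy the bit, which is the membrane-versus-string distinction in miniature.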

Sierpinski carpet, with edges placed to form a “Sierpinski graph”.

The codes that Courtney defines are called embeddable fractal product codes (EFPC). The idea is that, if a product of two 1D Ising models isn’t a 2D self-correcting model, but a product of two 2D Ising models is a self-correcting memory, then what happens if we take two 1.5D Ising models and try to make a 3D self-correcting memory? The backbone of the construction consists of fractals such as the Sierpinski carpet that have infinite ramification order, meaning that an infinite number of edges of an associated graph must be cut to split it into two infinite components. Defining an Ising model on the Sierpinski graph yields a finite-temperature phase transition for the same reason as in the 2D Ising model: the Peierls argument, which is essentially a counting argument about the density of domain walls in equilibrium with fixed boundary conditions. This is exactly the kind of behavior needed for self-correction.

Splitting the Sierpinski graph into two infinite components necessarily cuts an infinite number of edges.
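To make the ramification property concrete, here is a small illustrative script (mine, not code from the paper) that builds the Sierpinski carpet and counts the edges crossing a vertical cut along the boundary of the left third of the carpet. The count grows with the fractal level, so this particular cut, at least, requires ever more severed edges as the carpet grows:

```python
def carpet(n):
    """Occupancy grid for the level-n Sierpinski carpet (3^n x 3^n cells).

    A cell (x, y) is filled iff no pair of base-3 digits of x and y is (1, 1),
    i.e. the cell never lands in a removed central square at any scale.
    """
    def filled(x, y):
        while x or y:
            if x % 3 == 1 and y % 3 == 1:
                return False
            x, y = x // 3, y // 3
        return True
    size = 3 ** n
    return [[filled(x, y) for x in range(size)] for y in range(size)]

def cut_edges(grid, x):
    """Number of nearest-neighbor edges crossing the vertical line at column x."""
    return sum(1 for row in grid if row[x - 1] and row[x])

# Cut along the left-third boundary: the crossing count triples with each
# level (2, 6, 18, 54, ...), so no fixed finite cut disconnects the carpet.
for n in (1, 2, 3, 4):
    print(n, cut_edges(carpet(n), 3 ** (n - 1)))
```

This straight-line cut is only a sanity check, of course; infinite ramification order is the stronger statement that *every* way of splitting the infinite carpet into two infinite components severs infinitely many edges.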

Using the adjacency of the Sierpinski graph, the next step is to use a toric code-like set of generators on this graph, paying careful attention to the boundary conditions (in particular, plaquette terms are placed in such a way that the stabilizer group contains all the cycles that bound areas of the fractal, at any length scale). Then using homological product codes gives a natural way to combine X-like and Z-like copies of this code into a new code that naturally lives in four dimensions. Although the natural way to embed this code requires all four spatial dimensions, it turns out that a low-distortion embedding is possible with distortion bounded by a small constant, so these codes can be compressed into three dimensions while retaining the crucial locality properties.

Remarkably, this construction gives a finite-temperature phase transition for both the X- and Z-type errors. It essentially inherits this from the fact that the Ising models on the Sierpinski graph have phase transitions, and it is a very strong indication of self-correcting behavior.

However, there are some caveats. There are many logical qubits in this code (in fact, the code has constant rate), and only the qubits associated to the coarsest features of the fractal have large distance. There are many logical qubits associated to small-scale features that have small distance and create an exponential degeneracy of the ground space. With such a large degeneracy, one worries about perturbative stability in the presence of a generic local perturbation. There are a few other caveats, for example the question of efficient decoding, but to me the issue of the degeneracy is the most interesting.

Overall, this is the most exciting progress since Haah’s cubic code. I think I’m actually becoming optimistic about the possibility of self-correction. It looks like Courtney will be speaking about his paper at QIP this year, so this is yet another reason to make it to Sydney this coming January.

A Breakthrough Donation for Computer Science

Lance Fortnow has a post summarizing some of the news affecting the CS community over the past month, including updates on various prizes as well as the significant media attention focusing on physics- and math-related topics such as movies about Turing and Hawking as well as Terence Tao on the Colbert Report.

From his post, I just learned that former Microsoft chief executive Steven Ballmer is making a donation to Harvard that will endow twelve—that’s right, 12—new tenured and tenure-track faculty positions in computer science. This is fantastic news and will have a huge positive impact on Harvard CS.

One thing missing from Lance’s list was news about the Breakthrough Prizes in mathematics and fundamental physics. In case you’ve been living under a rock, these prizes give a very hefty US $3 million purse to the chosen recipients. The winners are all luminaries in their field, and it’s great to see them get recognition for their outstanding work.

On the other hand, juxtaposing Ballmer’s donation and the Breakthrough Prizes couldn’t offer a starker contrast. It costs the same amount—$3 million—to endow a university full professor with appointments in more than one discipline at Duke University. My initial googling would suggest that this is a pretty typical figure at top-tier institutions.

What if, instead of offering a cash prize to the Breakthrough Prize winners, the reward were an upgrade to an endowed chair at the winner’s current institution, subject to the condition that the existing position go to a new tenured or tenure-track hire in the same field? This seems like a much better investment in science overall: it would help build a community of researchers around the prize winner, and the marginal benefit to this community from associating with the prize winner is likely far greater than any extra incentive researchers get from striving to win $3M in cash under the current system.

QIP 2015

The website is up for QIP 2015, which will be held this year in beautiful Sydney, Australia. Here is a timeline of the relevant dates:

  • Submission of talks deadline: Sep 12, 2014
  • Submission of posters deadline: Oct 25, 2014
  • Decision on talks and posters submitted before talk deadline: Oct 20, 2014
  • Decision on posters submitted after talk deadline: Nov 15, 2014
  • Tutorial Session: Jan 10-11, 2015
  • Main Conference: Jan 12-16, 2015

And students, don’t worry, there are plans to include some student support scholarships, so we hope that many of you can attend. We’re looking forward to seeing you all here!