QIP 2012 Day 1

Sergey Bravyi, Topological qubits: stability against thermal noise.
Joint work with Jeongwan Haah, based on arXiv:1105.4159.


Sergey opened his talk with an image of a bagel in a Solfatara landscape, meant to evoke topologically protected qubits (get it? a bagel?) in a thermal (steam) bath.
The question is whether we can engineer an energy landscape with high barriers between encoded states so as to enable passive quantum error correction; i.e. to create a topologically protected qubit. Here are the main ingredients:

  • Stabilizer codes on a lattice with local generators and code distance $d \approx L$, where $L$ is the lattice size.
  • A code Hamiltonian with an energy penalty for states outside the logical subspace. The Hamiltonian is defined simply by having each violated stabilizer cost unit energy (written out below).
  • Thermal noise: this should be described by a Markovian master equation with local jump operators. The spectral density should obey detailed balance; this is where temperature enters.
  • Finally, there should be a decoder. Here we consider the RG (renormalization group) decoder.
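Written out, in the normalization I read the second bullet as using (each violated generator costs one unit of energy):

$H = \sum_a \frac{1 - G_a}{2},$

so a state violating $m$ stabilizers has energy $m$ above the ground space.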

Kitaev’s toric code (1997) was the first example of a topological QECC. If you’re not familiar with it, then you should read about that instead of the current results. The key idea is that the errors are strings that wrap around the torus, and so have length at least $L$. However, since a partial string incurs only a constant energy penalty, the memory time (= the amount of time until the state degrades by a constant amount) is $O(e^{2\beta})$ [arXiv:0810.4584], independent of the lattice size! This is pretty disappointing; we can get this performance by not encoding the qubit at all.
Indeed, any 2D topological stabilizer code has constant energy barrier [0810.1983], which implies that relaxation to thermal equilibrium occurs at a rate independent of system size L.
This no-go result is daunting, but in higher dimensions we can still achieve some nontrivial topological protection. Here’s what we want: a topological stabilizer code in a $D$-dimensional array with commuting Pauli stabilizers $G_a$, each with support of size $O(1)$, and with each qubit acted on by $O(1)$ stabilizers. The ground space of the corresponding Hamiltonian is the set of states that are +1 eigenvectors of every stabilizer. We would like the distance to grow at least linearly with the lattice size; such models have intrinsic robustness.
Fortunately, there is a shining example of a spatially local stabilizer code with all the self-correcting properties we want. Unfortunately, it requires D=4; it is the 4-d toric code. Here is its energy landscape.

Defects are loops around error membranes, and any sequence of single-qubit Pauli errors mapping a ground state to an orthogonal one must create roughly $L$ defects at some step. Thus the memory time is exponential in $L$ for low $T$ [0811.0033].
All 3D stabilizer Hamiltonians studied so far have a constant energy barrier. Indeed, this is true of all 3D codes that are invariant under translation and obey “scale invariance” [1103.1885]. (Here, scale invariance means that the ground state degeneracy doesn’t depend on the size of the lattice.) Haah’s breakthrough 1101.1962 is the first example of a 3D stabilizer code without string-like logical operators. In his model, defects cannot move very far.
Our results give self-correcting properties of any topological stabilizer code that obeys the no-strings rule, e.g. Haah’s codes.

  1. Their energy barrier is $\Omega(\log L)$.
  2. They have partial self-correction. Specifically, the Arrhenius law together with the logarithmic energy barrier gives lifetime $L^{c\beta}$, for any inverse temperature $\beta>0$, and for $L \leq L^*$ for some critical value $L^* \sim e^{\beta/3}$. See figure:

This lifetime reaches a maximum value (as a function of $L$) when $L$ is proportional to $e^{\beta/3}$. Thus, choosing $L$ optimally results in a memory lifetime of $T_{\mathrm{mem}} \sim e^{c\beta^2/3}$.
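To spell out the arithmetic behind that: with an energy barrier of $c\log L$, the Arrhenius law gives

$T_{\mathrm{mem}} \sim e^{\beta \cdot c\log L} = L^{c\beta},$

and plugging in the largest allowed size $L^* \sim e^{\beta/3}$ gives $T_{\mathrm{mem}} \sim e^{c\beta^2/3}$.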
One example of a code with no string-like errors is the 3-d cubic code of Haah. It is obtained by placing two qubits at each site of a cubic lattice and tiling the following stabilizer generators. See figure (from the Bravyi-Haah paper).

The noise model: One can get a Markovian noise model by considering the Davies weak coupling limit, where the Lindblad operator $A_{k,\omega}$ transfers energy $\omega$ from the system to the bath. The spectral density $r(\omega)$ obeys detailed balance: $r(-\omega) = e^{-\beta\omega}\, r(\omega)$. Note that this is the only place where temperature enters.
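For concreteness, the Davies generator has the standard Lindblad form (this is the textbook expression, not a formula quoted from the talk):

$\frac{d\rho}{dt} = -i[H,\rho] + \sum_{k,\omega} r(\omega)\left( A_{k,\omega}\, \rho\, A_{k,\omega}^\dagger - \tfrac{1}{2}\{A_{k,\omega}^\dagger A_{k,\omega},\, \rho\}\right),$

with the detailed-balance condition above relating the rates at $\pm\omega$.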
The decoder: After measuring the syndrome, the correction operator is found using ideas from Jim Harrington’s thesis [2004; I can’t find it online] and the Renormalization Group decoder from 1006.1362. See photo.
The strategy is to:

  1. Find connected clusters
  2. For each connected cluster C find the min enclosing box and try to annihilate C by an operator acting within the box.
  3. Stop if no defects left.
  4. If some defects are left, then increase the length by factor of 2 and repeat.

The RG decoder stops when all defects have been annihilated or when length reaches the lattice size, in which case we have failure. We can also have failure when all defects have been annihilated but the system has been returned to a wrong ground state.
The RG decoder can be implemented in time poly(L) (and can be parallelized to run in time log(L)).
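Here is a runnable toy version of that decoding loop, on a one-dimensional toy syndrome where defects are just integer positions on a line. This is my own illustration of the control flow (find clusters, try to annihilate each one inside its enclosing box, double the scale); the rule that a cluster can be annihilated iff it contains an even number of defects is a stand-in for the real local-correction subroutine, not the actual Bravyi-Haah decoder.

```python
def find_clusters(defects, scale):
    """Group sorted defect positions into clusters of points within `scale` of a neighbour."""
    clusters, current = [], []
    for d in sorted(defects):
        if current and d - current[-1] > scale:
            clusters.append(current)
            current = []
        current.append(d)
    if current:
        clusters.append(current)
    return clusters

def rg_decode(defects, lattice_size):
    defects = set(defects)
    scale = 1
    while scale <= lattice_size:
        for cluster in find_clusters(defects, scale):
            if len(cluster) % 2 == 0:       # toy criterion: cluster can be annihilated
                defects -= set(cluster)     # ...by an operator inside its enclosing box
        if not defects:
            return "success"                # step 3: no defects left
        scale *= 2                          # step 4: double the length scale and repeat
    return "failure"                        # defects survived up to the lattice size

print(rg_decode([3, 4, 40, 41], lattice_size=64))   # two small error strings: success
print(rg_decode([5], lattice_size=64))              # an unpaired defect: failure
```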
We can define the memory time as follows. We time evolve according to the master equation given the local quantum jump Lindblad operators. After some time t, we run the decoder. Then we check the worst-case (over all initial states) of the trace distance between the initial and final state.
$\| \mathcal{D}(\rho(t)) - \rho(0) \|_1 \le O(t)\, N\, 2^k\, e^{-\beta m} (1+e^{-\beta})^N.$
Here N is the number of physical qubits, k is the number of logical qubits, t is the time, beta is inverse temperature and m is the number of errors that can be corrected by the decoder (or more precisely, the height of the energy barrier.) Essentially, there is a temperature/energy factor that competes with an entropic factor.
Analyzing it, we find that the memory time grows at least as fast as $L^{c\beta}$ as long as $L \le e^{\beta/3}$.
This gives a large but finite lower bound on the memory time at a fixed temperature.
Is it optimal? Maybe the analysis is pessimistic and the Haah code actually has an exponentially large lifetime? Maybe growing L can result in unbounded memory time when beta is sufficiently large? The authors carried out a Monte Carlo simulation of Haah code with only X errors to test these theories. It used 1000 days of CPU time on IBM’s Blue Gene.
The plots are qualitatively similar to the lower bounds on memory time. In particular, they show that the logarithm of the maximum memory time scales quadratically with $\beta$, and that the memory time at fixed temperature increases with $L$ and then starts to decrease.
How do the proofs go?
Idea 1: The no-strings rule implies localization of errors. That is, any error $E$ can be written as $E = E_{\rm loc} \cdot G$, where $G$ is a stabilizer and $E_{\rm loc}$ is supported on at most 2 neighborhoods.
In order for the accumulated error to have large weight, at least one intermediate syndrome must be non-sparse, with a pair of defects within distance $a$ of each other.
Idea 2: Use the scale invariance of the no-strings rule.
Define sparseness and denseness at different scales. A syndrome which is dense at $p$ consecutive scales will include a cluster of $p$ defects.
Show that to implement a logical operation, at least one intermediate syndrome must be dense at roughly $\log(L)$ different scales.

Mark Wilde,
Advances in classical communication for network quantum information theory, based on 1109.0802 and 1102.2624

Consider the problem of an interference channel, with two senders and two receivers. This problem is hard enough in the “ccqq” case, meaning that the channel takes two classical inputs to a bipartite quantum output: i.e. $x_1, x_2 \rightarrow \rho_{x_1,x_2}^{B_1 B_2}$. This is depicted here (figure taken from their paper).

Along the way to understanding this, we’ll need to consider a multiple-access channel, with two senders and one receiver. The achievable rate region is in a sense an optimistic one, bounded by the obvious constraints:

$R_1 \leq I(X_1;B|X_2)$
$R_2 \leq I(X_2;B|X_1)$
$R_1 + R_2 \leq I(X_1 X_2;B)$

Why are these obvious? The first two constraints are what we would obtain if the receiver had successfully decoded one message, and both parties got to use that information to design a joint encoder-decoder pair. The last constraint is what we would get if the two senders could work together, apart from the requirement that their average input to the channel should be product across the two senders. Thus, these are natural upper bounds. The nonobvious part is that this region can be achieved.
The coding theorem was obtained by Andreas Winter, when he was a grad student (quant-ph/9807019). How does it work? First, we observe that the desired rate region can be obtained as a convex combination of the “corner points” of the rate region: $(I(X_1;B), I(X_2;B|X_1))$ and $(I(X_1;B|X_2), I(X_2;B))$.
To achieve one of these corner points, use the fact that the two senders are sending uncorrelated inputs, so we can average over the inputs of one sender (say $X_2$) to obtain an effective single-sender/single-receiver channel, for which coding is possible at rate $I(X_1;B)$. Of course, measurement is potentially destructive, but since we are operating in the regime of very low error, we can use the fact that measurements with nearly certain outcomes cause very little damage (the infamous “Tender Measurement Lemma”). (Ok, it’s mostly called the “gentle measurement lemma” but I like “tender.”) Thus, the decoder obtains $X_1$ and, conditioned on it, can decode $X_2$. (Clearly the sender knows $X_1$ as well. Note that this is no longer true for the multiple-sender case.)
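The reason the corner points saturate the sum-rate bound is just the chain rule for mutual information:

$I(X_1 X_2; B) = I(X_1;B) + I(X_2;B|X_1) = I(X_2;B) + I(X_1;B|X_2),$

so decoding $X_1$ first at rate $I(X_1;B)$ and then $X_2$ at rate $I(X_2;B|X_1)$ (or vice versa) achieves the total rate $I(X_1X_2;B)$, and time-sharing between the two orders traces out the rest of the boundary.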
Sequential decoding: To do decoding, we need to use the typical projector for the average state, as well as the conditionally typical projectors for codewords. The HSW idea is to use the pretty good measurement. The idea of sequential decoding, in analogy with the classical idea, is to check each codeword sequentially. It works out pretty similarly, with the measurement operators in both cases being lightly distorted versions of the conditionally typical projectors.
The crucial lemma making this possible is from 1109.0802, by Pranab Sen. He calls it a “noncommutative union bound”. The statement is that

$1 - \mathrm{tr}(\Pi_n \cdots \Pi_1\, \rho\, \Pi_1 \cdots \Pi_n) \le 2 \sqrt{\sum_{i=1}^n \mathrm{tr}((1-\Pi_i)\rho)}$
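Here is a quick numerical sanity check of this bound on a random pure state and a few near-complete projectors; this is my own illustration, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 4

def random_unit_vector(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

psi = random_unit_vector(d)
rho = np.outer(psi, psi.conj())

# Each projector kills one random direction, so each individual "error"
# tr((I - Pi_i) rho) is small (about 1/d) and the bound is non-trivial.
projs = []
for _ in range(n):
    k = random_unit_vector(d)
    projs.append(np.eye(d) - np.outer(k, k.conj()))

prod = np.eye(d, dtype=complex)
for P in projs:                      # builds Pi_n ... Pi_1 (Pi_1 applied first)
    prod = P @ prod
lhs = 1 - np.real(np.trace(prod @ rho @ prod.conj().T))
rhs = 2 * np.sqrt(sum(np.real(np.trace((np.eye(d) - P) @ rho)) for P in projs))
print(f"LHS = {lhs:.4f}  <=  RHS = {rhs:.4f}")
```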

Successive decoding: In general, we’d like to analyze these problems of multipartite information theory using the traditional single-sender/single-receiver tools, like HSW. Since each individual code works with exponentially small error, and the gentle-measurement lemma states that decoding causes exponentially small damage, we should be able to compose several protocols without much trouble. The only subtlety is that the typical projectors don’t commute, which is where the noncommutative union bound comes in. We apply it to the following “intersection projectors”:
$\tilde{\Pi}_{x_1^n, x_2^n} \equiv \Pi\, \Pi_{x_1^n}\, \Pi_{x_1^n, x_2^n}\, \Pi_{x_1^n}\, \Pi$
Most important open problem: find a general three-sender quantum simultaneous decoder. Solving this should hopefully yield the insights required to handle an unlimited number of senders.

Nilanjana Datta,
Quantum rate distortion, reverse Shannon theorems and source channel separation,
joint work with Min-Hsiu Hsieh, Mark Wilde, based on 1108.4940.

Compression and transmission: the source coding and noisy coding theorems of Shannon. The fundamental limit on compression is the entropy.
Lossless compression is often too stringent a condition. For example, consider jpg compression, which gives faithful images (to a human eye) despite throwing away lots of bits of information. We consider Rate Distortion Theory, i.e. the theory of lossy data compression. We are interested in the maximum rate of compression given a fixed maximum amount of distortion. Define the rate distortion function:
$R_c(D)$ = the minimum classical asymptotic cost for sending many copies of the state $\rho$ with per-copy distortion $\leq D$, where the subscript $c$ is for classical.
For a fixed value of the distortion $0 \le D < 1$, we work in the storage setting. Define a quantum rate distortion function in the asymptotic limit.
Barnum (in quant-ph/9806065) proved the first lower bound on the quantum rate distortion function:
$R_q(D) \ge \min_{S_\rho(N,D)} I_c(\rho,N)$
where the term on the right is the coherent information and $S_\rho(N,D) = \{N\ \mathrm{CPTP} : d(\rho,N) \le D\}$. But the coherent information can be negative, so this can’t be a tight lower bound in all cases.
Now move to the communication setting. One can define the entanglement-assisted quantum rate distortion function. They find a single-letter formula for it, given by $\min_{S_\rho(N,D)} \frac{1}{2} I(R;B)_\omega$ (in terms of the quantum mutual information). This is both a lower bound and achievable by using quantum state splitting. Alternatively, the achievability follows from the quantum reverse Shannon theorem (QRST).
The unassisted quantum rate distortion function can also be found using the QRST, but one needs a regularized formula.

Jon Yard,
Quantum communication with Gaussian channels of zero quantum capacity,
joint work with Graeme Smith and John A. Smolin, based on 1102.4580.

The title on arxiv.org is “Gaussian bosonic synergy: quantum communication via realistic channels of zero quantum capacity”. Realistic? Synergy?? Think about this, kids, before you go off to work in industrial research.
This paper concerns the fascinating topic of channels with zero quantum capacity. For zero classical capacity, these channels are simple to describe. Here is one example.

But zero quantum capacity occurs even for channels whose output depends on the input. For example, consider the completely dephasing channel, aka a classical channel. Obviously this has no quantum capacity. The fact that this channel has zero quantum capacity follows both because it is anti-degradable (meaning that Eve’s output can simulate Bob’s; this implies zero capacity by no-cloning) and because it is PPT, meaning it maps every input state to a PPT output state, or equivalently, its Jamiolkowski state (obtained by feeding in half of a maximally entangled state) is PPT. Sadly, these two conditions are still pretty much our only examples of zero-capacity channels. See a previous talk of Graeme’s for a framework that could potentially expand this set of conditions.
Let’s talk a bit more about those two examples. The anti-degradable channels are intuitively those that give more to Eve than to Bob. So erasure channels with erasure rate >50% count, as do attenuation channels (for photons; these can be modeled by a beamsplitter) and certain depolarizing channels (using an argument due to Brandao, Oppenheim, and Strelchuk http://arxiv.org/abs/1107.4385). On the other hand, PPT channels are at least easy to check, and include some channels with nonzero private capacity.
The general question of characterizing zero-capacity channels is a very interesting one, and one that it’s not clear we are up to. But here is a nice specific version of the problem. The capacity of the depolarizing channel drops to zero at some noise rate $p^*$, where $0.2552 \leq p^* \leq 1/3$. What is $p^*$???
A good quote from Jon.

Superactivation demonstrates that asking about the capacity of a quantum channel is like asking about a person’s IQ: one number isn’t enough to capture everything you might want to know.

The main result: There exist Gaussian channels, each with zero quantum capacity, that can be combined to achieve nonzero capacity. These channels are more or less within the reach of current experiments (not sure about all the decoding), requiring about 10dB of squeezing and about 60 photons/channel. This is interesting in part because Gaussian channels seemed to be so well-behaved! For example, there is no NPT bound entanglement in Gaussian channels.
Infinite dimensions: This paper works in infinite dimensions, which many in our field are inherently uncomfortable with, and others are inexplicably drawn to. Like representation theory, or complexity classes with lots of boldface capital letters, the presence of phrases like “Quasi-free maps on the CCR algebra” can be either a good or a bad thing, depending on your perspective.
Q: For those of us who of prefer finite dimensions, can we truncate the Hilbert space, perhaps by taking advantage of a constraint on the total number of photons?
A: In principle yes, but then the channels are no longer Gaussian, so our analysis doesn’t work, and in general, Gaussian things are easier to analyze, so that is a sense in which infinite dimensions can actually be simpler to deal with.

Runyao Duan,
Bounds on the distance between a unital quantum channel and the convex hull of unitary channels, with applications to the asymptotic quantum Birkhoff conjecture,
joint work with Nengkun Yu and Quanhua Xu.

The story behind this work starts with the Birkhoff-von Neumann theorem (often called Birkhoff’s theorem), which states that doubly stochastic matrices (matrices with nonnegative entries and all rows and columns summing to one) are convex combinations of permutations. An analogous claim for quantum channels would be that unital channels (i.e. those mapping the maximally mixed state to itself) are mixtures of unitaries. However, this claim is false. More recently, Smolin, Verstraete and Winter conjectured in quant-ph/0505038 that many copies of a unital channel should asymptotically approach convex combinations of unitaries. (The rest of their paper contained evidence suggestive of this claim, and in general has nice results that are an important precursor of the merging revolution in quantum information theory.) This conjecture was called the Asymptotic Quantum Birkhoff Conjecture (AQBC) and was discussed on Werner’s open problems page. A few years later, it was shown to hold for everyone’s favorite unital-but-not-mixture-of-unitaries channel (What’s that? The one with Jamiolkowski state equal to the maximally mixed state on the antisymmetric subspace of $\mathbb{C}^3 \otimes \mathbb{C}^3$, of course!) by Mendl and Wolf.
Sadly, the AQBC is also false, as Haagerup and Musat recently proved. However, their paper uses von Neumann algebras, and their abstract begins

We study factorization and dilation properties of Markov maps between von Neumann algebras equipped with normal faithful states, i.e., completely positive unital maps which preserve the given states and also intertwine their automorphism groups.

(Another approach, in preparation, is due to Ostrev, Oza and Shor.)
This raises a new open question: is there a nice simple proof that finite-dimension-loving quantum information people can follow without having to learn any new math? (Note: probably we should learn more about von Neumann algebras. Just like we should make our own pizza from scratch. But if there’s a good pizzeria that delivers, I’d still like to hear about it.)
This result delivers, by disproving the AQBC with an elegant, intuitive proof. The key technical lemma is the kind of tool that I can imagine being useful elsewhere. It states that, for the problem of distinguishing two convex sets of quantum channels, there exists a single measurement strategy that works well in the worst case. This builds on a similar lemma (due to Gutoski-Watrous ‘04 and Jain ‘05) that is just minimax applied to state discrimination: if we want to distinguish two convex sets of density matrices, then there exists a single measurement that works well in the worst case. How well? Its performance is at least as good as what we get by choosing the worst pair of states from the two convex sets, and then choosing the optimal measurement based on these.
This paper extends this result to sets of quantum channels. The optimal distinguishability of a pair of quantum channels is given by their diamond-norm distance, which is simply the largest trace distance possible between their outputs when the channels are applied to one half of some (possibly entangled) state. To distinguish convex sets of channels, they use minimax again to show that there exists a measurement strategy (this time consisting both of an input state to prepare AND a measurement on the output) that works well for any worst-case choice of input channels. This set of measurement strategies sounds potentially non-convex; however, Watrous’s SDP characterization of the diamond norm shows that measurement strategies form a convex set, so everything works out.
Next, this paper applies this to find ways to estimate the distance between a unital channel and the convex hull of unitary channels. To do this, they use a type of channel which they call a Schur channel (more commonly known as a Schur multiplier): given a matrix $A$, define the channel $\Phi_A(X) = A \circ X$, where $\circ$ denotes the entrywise (Schur) product.
Theorem: The following are equivalent (TFAE):

  1. $\Phi_A$ is a quantum channel (i.e. a Schur channel)
  2. $\Phi_A$ is unital with diagonal Kraus operators
  3. $A$ is positive semidefinite and $a_{kk}=1$ for each $1\leq k\leq d$.
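Here is a quick numerical illustration of the theorem (my own check, not from the paper): take $A$ positive semidefinite with unit diagonal, and verify that the Schur map $\Phi_A$ is completely positive, unital and trace preserving.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# Gram matrix of unit vectors: positive semidefinite with a_kk = 1.
v = rng.normal(size=(d, 5)) + 1j * rng.normal(size=(d, 5))
v /= np.linalg.norm(v, axis=1, keepdims=True)
A = v @ v.conj().T

phi = lambda X: A * X          # Schur (entrywise) product

# Choi matrix J = sum_{ij} |i><j| (x) Phi(|i><j|); the map is CP iff J is PSD.
J = np.zeros((d * d, d * d), dtype=complex)
for i in range(d):
    for j in range(d):
        E = np.zeros((d, d), dtype=complex)
        E[i, j] = 1
        J += np.kron(E, phi(E))

X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
print("completely positive:", np.linalg.eigvalsh(J).min() > -1e-12)
print("unital:            ", np.allclose(phi(np.eye(d)), np.eye(d)))
print("trace preserving:  ", np.isclose(np.trace(phi(X)), np.trace(X)))
```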

Using Schur channels and their discrimination lemma, they are able to make short work of the AQBC. Their next result is that any Schur channel for which the convex hull of the channel’s (diagonal) Kraus operators doesn’t include any unitary is a counterexample to the AQBC. This follows from the fact that the closest random-unitary channel to a Schur channel can be taken (up to a factor of 2 in the distance) to be a mixture of diagonal unitaries.
This work doesn’t tell us all of the AQBC-violating channels, but it does describe a large class of them.
They also mentioned connections to Grothendieck’s inequality (work in progress) and Connes’ embedding problem. Connections to these two topics were also mentioned in Haagerup and Musat’s work.

Markus Greiner, Quantum Magnetism with Ultracold Atoms – A Microscopic View of Artificial Quantum Matter.

Some motivation: many condensed matter models can’t be tested with current experiments. Even in simple models it is difficult to calculate basic quantities. Want to build new materials. The idea: use ultracold atoms in optical lattices to simulate condensed matter models. The dream of a quantum simulator is to build a special purpose device which is still useful before full-scale quantum computers are built.
He mentioned that quantum simulators are robust to errors… I believe that this is an open question which theorists should be working on.
How do we cool the atoms? First trap the atoms with lasers. For weakly interacting quantum gases, we can form a BEC where all of the particles can be well described by a single wavefunction. But for strongly correlated quantum gases there are interactions which preclude such a description. In an optical lattice, we have atoms interacting strongly on lattice sites. There is new physics here: for example, superfluid to Mott insulator transition. We can use the optical lattice to make synthetic matter: we can simulate electrons in a crystal. With fermionic atoms, we can observe a fermionic superfluid and things like the BEC-BCS crossover. Bose-Hubbard models and high-T_c superconductivity could also be probed in these systems. Also, quantum magnets, low-dimensional materials, etc.
Let’s focus on quantum magnetism. The main experimental tool is the quantum gas microscope. The goal is to take “bottom-up” tools like high-fidelity readout and local control from e.g. ion traps to more “top-down” systems like optical lattices. (This is clearly related to scalability of quantum computation, ultimately.) The quantum gas microscope can image fluorescence from single atoms trapped in the lattice. To image this, they measure parity.
He then showed a movie of atoms hopping around in a shallow lattice (in the classical regime, obviously). Very cool. Here is one still from the movie:

The Hamiltonian for this system is a hopping term and an interaction term which penalizes multi-site occupations: a basic Bose-Hubbard model. If the hopping terms dominate, then it’s a superfluid; if the on-site repulsion is dominant, we get a Mott insulator. They can get approximately unit site occupation number in the Mott insulator regime.
How to measure interesting quantities, like correlation functions? Measure the site occupation number (mod 2) on neighboring sites. One can push this further and measure string order parameters. In the future, the hope is to measure more complicated things like Chern number.
They observed the Lieb-Robinson light-cone of correlations.
Algorithmic cooling: Suppose that you have random site occupations. Then you can try to “cool” the system by pushing some of the entropy into a subsystem and get a pure state on the rest. They use Rydberg blockade interactions to try to smooth the site occupation number fluctuations, and then transfer to a superfluid. The temperature is not usually constant in their system, but the entropy is consistently low.
Quantum magnetism: Ising spin models. First consider a chain with on-site longitudinal and transverse fields. There are two phases: paramagnetic and antiferromagnetic. By putting a Mott insulator in an electric field, we can observe this type of transition. We can try to extend these results to two dimensional frustrated spin systems. For example, dimerized states (four-state clock model). These models have spin liquid-like groundstates.
Towards a universal quantum computer: The toolbox includes low entropy initialization, single site readout, single site control, and interactions, all with high fidelity. So what does the future hold?
Q: What types of error are these quantum simulators robust to?
A: Temperature is one type of error. Thermalization might not always occur. Sometimes you want some disorder, since real systems also have disorder.
Q: When you say “fidelity” what do you mean by that?
A: We mean the Hamming distance between classical basis states when we talk about paramagnetic order. That is, our fidelity is 99% if a fraction 99/100 of our spins are correctly aligned when we image the sample.
One question that I didn’t get to ask: what can they say about models which have degenerate ground states? I’m obviously thinking about e.g. the toric code and other topologically ordered models. Can they prepare specific pure states in the ground state manifold?

André Chailloux, Optimal Bounds for Quantum Bit Commitment. Joint work with Iordanis Kerenidis, based on 1102.1678.

Some basic primitives for secure transactions:

Unlike QKD, here there are only 2 players, who distrust each other.
Let Alice and Bob’s optimal cheating probabilities be $P_A^*$ and $P_B^*$.
The best known quantum protocol achieves
$\max\{P_A^*, P_B^*\} = 3/4$ [Ambainis ‘01]
and the best known lower bound is
$\max\{P_A^*, P_B^*\} \geq 1/\sqrt{2}$ [Kitaev ‘03]
We improve the bound to $\max\{P_A^*, P_B^*\} \geq 0.739$ and build QBC protocols that get arbitrarily close to this bound.
The proof uses Mochon’s weak coin flipping protocol.

Gilles Brassard, Merkle Puzzles in a quantum world. Joint work with Peter Hoyer, Kassem Kalach, Marc Kaplan, Sophie Laplante, Louis Salvail, based on arXiv:1108.2316

Ralph Merkle was the inventor of Public Key Crypto in 1974 before Diffie and Hellman (but after the classified discovery). Gilles began with a great quote from Merkle:

The human mind treats a new idea the same way as the human body treats a new protein.

See Merkle’s web site for the amazing story of how he proposed public-key cryptography as a class project in grad school, and the professor described it as “muddled terribly”, and then he submitted it to JACM, which rejected it because it was “outside of the mainstream of cryptographic thinking.”
Key Distribution problem as imagined by Merkle:
Alice and Bob talk over an authenticated public channel. After doing so Alice and Bob should get a shared secret. Can only be computationally secure, but back then it was not even thought to be computationally possible.
Make the eavesdropper’s effort grow as fast as possible compared to Alice and Bob’s effort
We will measure effort by query complexity to a random oracle. We use that model because it allows us to prove a lower bound.
Merkle’s 1974 idea:
Choose a hard-to-invert, but injective, function f that acts on [N^2].
Alice chooses N random values of x and computes the values of f(x).
Alice sends all these values to Bob.
Bob wants to invert one of them.
He keeps trying his own random x’s until he gets a collision; call it s.
Bob sends f(s) to Alice and Alice finds secret s.
So after N queries by Alice and roughly N by Bob, they share the same secret s.
The eavesdropper has seen only the N (useless, to her) values f(x) and the one useful value f(s), and she needs about N^2/2 queries to find s, compared to N for Alice and Bob.
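A toy simulation of the scheme as described (my own illustration; the “hard-to-invert” function is just a fixed random permutation of [N^2], standing in for a cryptographic one):

```python
import random

random.seed(0)
N = 1000
domain = list(range(N * N))
perm = list(domain)
random.shuffle(perm)
f = lambda x: perm[x]                    # injective "hard-to-invert" stand-in

alice_x = random.sample(domain, N)       # Alice's N random preimages
published = {f(x): x for x in alice_x}   # she sends these N values f(x) to Bob

queries = 0
while True:                              # Bob tries random x's until he hits a collision
    x = random.randrange(N * N)
    queries += 1
    if f(x) in published:
        s = x
        break

print(f"Bob found a collision after {queries} queries (expected about N = {N})")
print("shared secret agrees:", published[f(s)] == s)   # f injective, so Alice recovers s
```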
Ok, so this is only a quadratic separation, but we don’t actually know that Diffie-Hellman is any better. The lower bound was proven by Barak and Mahmoody ‘08.
What about the quantum world, where the eavesdropper is quantum? Alice and Bob could be quantum but don’t need to be. The channel remains purely classical, so A&B can’t do QKD
If Eve is quantum, and Alice and Bob are classical, then Eve can find the key in O(N) time, which is the same amount of effort that Alice and Bob expend (albeit quantum and not classical). Unfortunately, this breaks the Merkle scheme completely.
Can it be repaired by allowing A&B to be quantum, or even by keeping them classical?
One option is to keep Alice classical, use a space of size N^3, and allow Bob to use BBHT to find one preimage in O(N) quantum queries. Eve, using Grover, needs $N^{3/2}$ queries, so Alice and Bob regain an advantage over Eve.
An improved protocol is for Bob to find two preimages in O(N) quantum queries.
Bob sends Alice the bitwise XOR of f(s) and f(s’). Given this Alice uses a table to find the secret.
This gives effort O(N^{5/3}) for Eve and O(N) for Alice and Bob. (It’s not trivial to show that Eve can do this well, and even less trivial to show that this is optimal.)
The proof involves something similar to Ambainis’ element distinctness algorithm, with a quantum walk on the Johnson graph. A crucial observation relates to the composition of our variant of element distinctness on N elements with searching each function value in a set of size N^2. This involves a new composition theorem for non-boolean functions.
What if Bob is classical?
Like before, there are two functions f and t, each with N^2 points.
Bob finds two preimages as in Merkle. The rest is the same: bitwise XOR.
How much time does a quantum eavesdropper need? N^{7/6}.
Also, there is a sequence of protocols for which best attack we can find tends to N^2 in the limit. See paper for details. 🙂

Salman Beigi,
Simplified instantaneous non-local quantum computation with applications to position-based cryptography, joint work with Robert Koenig.

Position-based crypto (i.e. authentication of position) via distance-bounding techniques
Challenge response protocol
V1 ----------P---------- V2
The verifiers send challenges timed for synchronized arrival at r0, the position being verified.
The honest prover computes and sends the answers.
Challenge/response is susceptible to cheating: colluding provers can each copy and forward the challenges. This shows why classical protocols are insecure.
Kent, Munro and Spiller 2011: quantum protocols, with both security and in-security proofs
Cheating strategy uses teleportation
In general a quantum protocol is a bipartite unitary.
An attack is a combination of local operations, shared entanglement and one round of communication sufficient to simulate it.
Vaidman ‘03 achieves this with doubly exponential cost in ebits, via recursive teleportation
Buhrman et al 09 key insights:
Simplified protocol uses O(2^n) ebits
Uses port-based teleportation, due to Ishizaka and Hiroshima (PRL 2009)
Share K maximally entangled states
Highly complicated measurement.
Recovery is very simple. Alice discards all but one of her halves of entangled states
New protocol for instantaneous measurement
Position based crypto: cheating landscape
Vaidman: doubly exponential entanglement, can always cheat
Port-based: singly exponential entanglement, sufficient for cheating
If cheaters are restricted to classical communication and only linear entanglement, one can get exponential soundness for position-based crypto
If cheaters are allowed quantum communication and linear entanglement, one can get only constant soundness

Florian Speelman, Garden Hose Game and Application to Position-Based Quantum Cryptography. Joint work with Harry Buhrman, Serge Fehr, Christian Schaffner, based on arXiv:1109.2563.

Position verification applications:

  • launching nuclear missiles (against enemies not equipped with FTL neutrinos, of course)
  • preventing prank calls of ambulances, SWAT teams or pizzas
  • communicating with right wifi router
  • computing international roaming fees properly
  • for when you want to know you are talking to S. Korea and not N. Korea
  • proving that “live”-blogging is actually occurring from the conference location, and is not just the result of typing out our previous opinions of the results.

On the positive side, one can show security if it is promised that there is no shared entanglement.
They show a specific class of schemes that obtain a tradeoff: increased classical communication for the honest players.
Example: classically secure, but broken by teleportation.
Verifier 0 sends $\psi$ to the prover
Verifier 1 sends a bit to the prover
The prover routes $\psi$ left or right according to the bit
Instead of one bit, use a function:
V0 sends a state and an n-bit string x to the prover
V1 sends an n-bit string y to the prover
The prover computes f(x,y) and sends $\psi$ to V0 or V1 depending on the outcome
The nice thing is that this is no harder for the honest prover.
But for cheating provers, more entanglement is needed. They can still do it with $2 \times 2^n$ ebits.
Garden hose model
A discrete way of looking at attacks of this sort
Alice and Bob share s pipes between them. Alice has a water tap.
Alice and Bob use hoses, like Enigma plugboard cords, to connect the ends of the pipes on their own sides in some pattern (depending on x and y respectively).
f(x,y) is whether the water comes out on the left or the right.
The garden-hose complexity of a function is the number of pipes needed to compute it.
Every piece of hose corresponds to a teleportation in the attack.
One can use the GH model to prove upper bounds:
  • Equality: $2n + O(\log n)$
  • Inner Product: $4n$
  • Majority: $n^2$
If f is in logspace, then GH(f) is poly(n).
But there do exist functions with exponential GH complexity.
One can prove lower bounds for explicit functions, but the best we can prove are linear.
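Here is a tiny simulator of the garden-hose model as described above (my own illustration of the rules, not code from the paper), together with a 3-pipe computation of one-bit equality:

```python
def link_map(pairs):
    # Pipe-end connections on one side: each pair of pipes is joined by a hose.
    m = {}
    for a, b in pairs:
        m[a], m[b] = b, a
    return m

def garden_hose(tap, alice_pairs, bob_pairs):
    """Follow the water and return the side ('alice' or 'bob') where it spills out."""
    alice, bob = link_map(alice_pairs), link_map(bob_pairs)
    pipe, side = tap, 'bob'           # water enters pipe `tap` and crosses to Bob's side
    while True:
        links = bob if side == 'bob' else alice
        if pipe not in links:
            return side               # open pipe end: the water spills out here
        pipe = links[pipe]            # a hose routes it into another pipe...
        side = 'alice' if side == 'bob' else 'bob'   # ...which crosses back over

def equality_1bit(x, y):
    # 3 pipes (0, 1, 2).  Alice taps pipe x; Bob joins his ends of pipes y and 2.
    # The water exits on Alice's side exactly when x == y.
    return garden_hose(tap=x, alice_pairs=[], bob_pairs=[(y, 2)]) == 'alice'

assert all(equality_1bit(x, y) == (x == y) for x in (0, 1) for y in (0, 1))
print("1-bit equality computed with 3 pipes")
```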

Poster Session 1

The poster session had tons of great posters that we will not remotely do justice. Instead, here are some photos from a few of them. Try to guess which ones. 🙂


QIP travel funding for American students and postdocs

Want to go to the big quantum conference of the year, but are short of funds? Then apply here by tomorrow (Nov 1, Seattle time) to have up to $675 of travel costs covered. (Hint: I’m mentioning this because not many people have applied so far.)

Required: Since the money comes from the NSF, you need to be a student or postdoc at a US institution. Your citizenship makes no difference.
Criteria: While lots of people are eligible, preference will be given for students, people presenting papers, and according to other criteria listed on the website. Also, if you apply, please have your advisor say something concrete about your funding situation other than “funds are tight.”

Codes, Geometry and Random structures: Day 3

Codes, Geometry and Random Structures

Pranab Sen,
Johnson-Lindenstrauss dimension reduction using unitary k-designs

Since the abstract isn’t on the website, I’ve reproduced it here:
Abstract:

The celebrated Johnson-Lindenstrauss lemma tells us that a random projection onto k-dimensional Euclidean space almost preserves the $\ell_2$-length of any set of $2^{\Theta(k)}$ vectors with high probability. The lemma has no dependence on the dimension of the ambient space in which the vectors originally lie. One way to implement a random projection is to take a Haar-random unitary operator on the ambient space, apply it to a vector and then project the resulting vector onto its first k coordinates. We show that a unitary poly(k)-design can be used for this purpose instead of a Haar-random unitary. This allows us to perform the dimension reduction quantumly in time polylogarithmic in the dimension of the original space and the number of vectors whose length is to be preserved.
We give two proofs of this result. The first proof digs into the Chernoff-style tail bound of a chi-square distribution, and the second proof is by a finer analysis of the so-called k-moment method originated by Bellare and Rompel and applied previously to unitary designs by Low.
Finally, we present an application of our result to private information retrieval where the sets stored have small size.

Here’s the protocol: Start with a state $|\psi\rangle \in \mathbb{C}^{d_1}$. In some cases, it is helpful to imagine that it is drawn from a known set of K states, but this is not necessary for all the applications.
Apply a random unitary from a t-design, divide into $d_1/d_2$ blocks, each of size $d_2$, and measure which block the resulting state is in. This is meant to mimic a classical J-L transform which would send $d_1$ dimensions to $d_2$ dimensions. It turns out that $t = O(d_2)$ is necessary to get good enough concentration. (And to figure out what the right moments of the underlying Haar distribution are, it turns out that using Levy’s Lemma as Richard Low does is not enough: Pranab needs some stronger dimension-independent concentration to avoid paying another $d_1$ cost.)
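Here is a classical toy version of this step (my own illustration, with a Haar-random unitary standing in for the poly(k)-design): apply a random unitary, keep one block of $d_2$ coordinates, and rescale by $\sqrt{d_1/d_2}$; the lengths come out close to 1.

```python
import numpy as np

rng = np.random.default_rng(1)
d1, d2, num_vecs = 1024, 64, 5

# Haar-random unitary via QR of a complex Gaussian matrix, with column phases fixed.
z = rng.normal(size=(d1, d1)) + 1j * rng.normal(size=(d1, d1))
q, r = np.linalg.qr(z)
u = q * (np.diag(r) / np.abs(np.diag(r)))

for _ in range(num_vecs):
    v = rng.normal(size=d1) + 1j * rng.normal(size=d1)
    v /= np.linalg.norm(v)                      # unit vector in dimension d1
    block = rng.integers(d1 // d2)              # a "which block" measurement outcome
    w = (u @ v)[block * d2:(block + 1) * d2]    # the d2 amplitudes in that block
    print(f"block {block:2d}: rescaled length = {np.sqrt(d1 / d2) * np.linalg.norm(w):.3f}")
```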
Patrick asked whether this would give a good dimension-reduction scheme if simulated classically. But using e.g. random circuits would require poly(t) gates for a total (classical) cost of $d_1\, \mathrm{poly}(d_2)$, whereas even a trivial Gaussian rectangular matrix only takes time $d_1 d_2$ to apply (and more sophisticated methods require time more like $d_1 \log(d_1)$).
One application (that Pranab called “toy” because of the parameters achieved) is to private information retrieval. The scenario is as follows:

  • Alice stores a subset $S \subseteq [m]$, $|S| < n$, $n \ll \frac{\log m}{\log\log m}$.
  • Bob has $x \in [m]$, and wants to know if $x \in S$.
    One solution is for Alice to send her entire database at cost $O(n \log m)$. One can achieve $O(n (\log n + \log \log m))$ using a set system due to Buhrman et al. But there is an $\Omega(n)$ lower bound if Bob doesn’t speak.

The quantum solution is to use J-L. Let $|S\rangle = \frac{1}{\sqrt{|S|}} \sum_{x\in S} |x\rangle$. Given a compressed version of $|S\rangle$, we want to determine whether $\langle x | S\rangle$ is zero or $\geq 1/\sqrt{n}$. This level of accuracy requires $O(n \log m)$ dimensions.
Here is the quantum protocol.

  1. Bob makes $n^2$ projections of $x$.
  2. He sends $\Theta(n^2)$ block names.
  3. Alice projects $|S\rangle$ onto these blocks and sends the resulting states.
  4. The total communication is $O(n^2(\log n + \log \log m))$.

A recurrent difficulty is the problem that the dimension-reduction procedure requires a “which block” measurement, which gives a random answer. This means that two parties cannot cheaply compare a state by applying this J-L procedure and a swap test, unless they get very lucky and happen to land in the same block. Indeed, this phenomenon in which it is easy to sample from a distribution, but presumed hard to reproduce any individual state, even given its label, is related to the conjectured security of some quantum money schemes. This difficulty was also explored in the related paper I wrote with Ashley Montanaro and Tony Short: 1012.2262. There we found mostly negative results about the difficulty of faithfully compressing quantum state J-L-style; Pranab circumvents some of them by including a large classical register specifying the “which block” information, but for many applications this need to condition on the classical register is deadly.

Beth Ruskai, Some old and new results on quantum marginals and reduced density matrices

The quantum marginal problem concerns the m-partite reduced density matrices of an N-body quantum system. Given a set of reduced density matrices, the N-representability problem asks whether there exists an N-party density matrix consistent with them. If we could decide this efficiently, then we could solve the QMA-complete local Hamiltonian problem efficiently. Despite this, a constant-factor approximation is not ruled out, since the reduction is only known to work at 1/poly(N) accuracy.
Often we restrict to the case of N fermions. In this case, anti-symmetry plays a key role; for example, for $m=1$, $\rho_1$ is N-representable iff all its eigenvalues are $\leq 1/N$. (This is essentially the Pauli exclusion principle.) I said “for example,” but really for $m=2$ it’s already not fully understood, and it is known that the full criteria do not depend only on eigenvalues.
There are a number of nice technical results, which I did not fully follow in part because I was typing Pranab’s talk still. (Yes, I am aware of the irony.)
For example, if $R_m$ denotes the set of m-partite marginal states, then we know that $\{\rho_N : \rho_N$ maps to an extreme point of $R_m\}$ increases with m.
Many important results were proven in a 1972 paper of Erdahl, although one of the key proofs (that every extreme point is exposed) turns out to have a bug in it.
One useful theorem is that for $2m \geq N$, the preimage of an extreme point is unique.
The intuition behind the theorem is that if a point has a nonunique preimage, then by a suitable averaging over a phase, we can show that this point is a mixture of two different valid m-partite density matrices.
In more detail: write the m-body density matrix as
$\rho_J = \sum_j \mu_j^2 |\chi_j\rangle\langle \chi_j|$
and purify it:
$|\psi\rangle = \sum_j e^{i\omega_j} \mu_j |\chi_j\rangle |\phi_j\rangle.$
If the $\omega_j$ are not unique, then we can write
$|\psi\rangle = x_1 |\psi_1\rangle + e^{i\omega} x_2 |\psi_2\rangle,$
where any $\omega$ gives the same marginal $\rho_J$.
In this case, we can average over $\omega$ and get the mixture
$x_1^2 |\psi_1\rangle\langle \psi_1| + x_2^2 |\psi_2\rangle\langle \psi_2|$, implying that $\rho_J$ is not extreme.
In general can we say that the pre-image of an exposed point is unique? This open question dates back at least to Erdahl’s paper, and the answer is no: QECCs serve as a counter-example. The proof is in 1010.2717. I think this is a nice example of how quantum information tools with operational goals (e.g. fighting decoherence) turn out to have interesting foundational interpretations.

Omar Fawzi, Almost Euclidean sections of L1 and quantum uncertainty relations

Metric uncertainty relation:
Here is the main idea: given a state $|\psi\rangle$, apply $U_k$ where k is drawn randomly from $\{1,\ldots,t\}$. (Here t is much smaller than the dimension of the state.) Then we trace out a small subsystem B, leaving a larger quantum system A, say with n qubits. If K is the “key” register recording k, then we hope that the state on AK is approximately maximally mixed.
It turns out that what we need here is a low-distortion embedding of $\ell_2$ into $\ell_1(\ell_2)$. That is, if our state is mapped to $\sum_{k,a,b} \alpha_{k,a,b} |k,a,b\rangle$, then we would like the following inequality to hold:
$\sqrt{\dim AK}\, (1-\epsilon) \leq \sum_{k,a} \sqrt{\sum_b |\alpha_{k,a,b}|^2} \leq \sqrt{\dim AK}$
The $\ell_2$ system is the one we discard, and we would like it to be small.
What can we achieve?
For random unitaries, we get $t = O(\log(1/\epsilon)/\epsilon^2)$ and $\dim B = O(1/\epsilon^2)$.
But what can we achieve explicitly (i.e. efficiently on a quantum computer)? When $t=2$, MUBs guarantee that the state of AK has entropy at least $n/2$ (due to entropic uncertainty relations), but not high enough to guarantee high fidelity with the maximally mixed state. Nor does it work to take more MUBs.
Unitary 2-designs work, but are very costly.
The construction is a variant of one proposed by Indyk. The idea is to first apply a MUB (using $O(1)$ random bits), which guarantees creating entropy at least $n/2$, and then a permutation extractor, which uses $O(\log n)$ bits to extract a constant fraction of this entropy. We put these extracted bits to the side and continue, each time multiplying the number of remaining qubits by a constant strictly less than one (something like 7/8). Eventually we are left with $\log(n)$ qubits that we just refer to as the $\ell_2$ part; i.e. the B system.
This has a number of nice consequences. The relations between these different consequences are also nice, and in my opinion, underappreciated. Unfortunately, I won’t do them justice here, but here is a brief sketch:

  • An almost-Euclidean subspace of $\ell_1(\ell_2)$.
  • Metric uncertainty relations based on explicit unitaries
  • The first construction of efficient locking
  • Efficient quantum identification codes using n classical bits and $O(\log^2(n/\epsilon))$ qubits.

Unfortunately, I had to catch an early flight, and missed the rest of the workshop. It’s a pity, since the talks seemed like they’d be pretty exciting.

Codes, Geometry and Random Structures: Day 2

Codes, Geometry and Random Structures

Graeme Smith, Detecting incapacity, based on 1108.1807

A central question in quantum information theory is determining when a channel has nonzero quantum capacity. Pretty much all we know here is that there are a few criteria for proving a channel has zero quantum capacity: PPT channels can’t transmit entanglement (since LOCC can’t change PPT states to non-PPT states) nor can anti-degradable channels (because of no-cloning). These two arguments appear to both be pretty specific. Can we put them on the same footing, and hopefully also derive some new criteria?
That’s what this paper does. The talk was good, but the paper also describes the results clearly.
Here is a sample of the results.
Assume that R is unphysical on a set S; e.g. S is the set of quantum states and R is the transpose map. Suppose that for any physical map D, there exists a physical map $D^*$ with $R\circ D = D^* \circ R$. If R is the partial transpose, then $D^*$ is simply the operation you get from taking the complex conjugate of all the Kraus operators.
Their theorem then states that if $R\circ N$ is physical, then N cannot transmit the states in S. The proof is a simple exercise in composition.
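Spelling out that exercise (my paraphrase of the composition argument): if N could transmit the states in S, there would be a physical decoding map D with $D\circ N = \mathrm{id}$ on S, and then

$R = R \circ D \circ N = D^* \circ (R \circ N) \quad \text{on } S,$

which is a composition of physical maps and hence physical, contradicting the assumption that R is unphysical on S.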
In this case we say that N “physicalizes” R or, equivalently, R “incapacitates” N.
This is not quite enough to get the no-cloning criterion, but a mild generalization will do the job. Graeme gave a nice explanation in terms of how teleporting information involves going backwards in time through the EPR pairs, and if PPT states are used, then the time-traveling information gets confused and doesn’t know whether to get forwards or backwards. However, if this principle is implicit in the proof, then it’s very implicit.

Jean-Pierre Tillich, Quantum turbo codes with unbounded minimum distance and excellent error-reducing performance

LDPC codes are successful classically, but quantumly they suffer many drawbacks: their distance isn’t so good (but see 0903.0566), their Tanner graphs have short cycles (specifically 4-cycles), and iterative decoding doesn’t work. One particular no-go theorem is that there is no convolutional encoder that is both recursive and non-catastrophic (0712.2888).
In this talk, Tillich discusses a catastrophic and recursive encoder. It achieves rate 1/8 and, somewhat surprisingly, it achieves minimum distance of $\Omega(\log(n)/\log\log(n))$ with high probability. He conjectures that this should be $\Omega(\log n)$ for the right choice of interleaver.
The resulting code can be thought of not so much as “error-correcting” but “error-reducing.” Error rate $p=0.14$ becomes $10^{-3}$, and $p=0.105$ becomes $10^{-4}$. This compares favorably with the toric code threshold of $p=0.15$. He suspects that the limitation here comes from the iterative decoder.

Jurg Wullschleger, The decoupling theorem

The decoupling theorem is arguably the Book proof of most quantum coding theorems. The encoder applies a random unitary (in some problem-dependent way) and transmits part of the output to the receiver; treat this part as being traced out, and if she keeps a part, consider it to be controlled by Eve. If the resulting state has the reference system “decoupled” from Eve, then, since the remaining parts of the state (controlled by Bob) purify everything, a local unitary on Bob’s side can give him pure entangled states with both the reference and, separately, with Eve. This allows the complicated task of transmitting quantum information reliably (which is hard enough that proving that the coherent information is the quantum capacity originally took a lot of difficult technical work) to be reduced to the simpler goal of destroying correlations.
Decoupling theorems were originally developed for the state-merging problem, by Horodecki-Oppenheim-Winter ’05 (where it was “Lemma 5” or something similarly marginal). Then it was further developed by quant-ph/0606225, where it was called a Theorem. Then, in later work, it moved to the title. So it took some time for the community to fully appreciate how useful this tool is.
Some of these tools use smoothed min- and max-entropies, which can be thought of as one-shot variants of von Neumann entropy that are either pessimistic or optimistic, depending on application. Amusingly, the smoothed max-entropy is not defined by taking a smoothed rank, but is defined in order to satisfy a relation that we’d like (which also holds for pure states). This is reminiscent of the speed of light, which is an integer number of meters/second by definition.
For any pure state $\rho^{XYE}$, define the smoothed max-entropy to be
$H_{\max}^\epsilon(X|Y)_\rho = - H_{\min}^\epsilon(X|E)_\rho.$
Other definitions are also used, but are pretty close.
In this talk, Jurg described a converse theorem to the decoupling theorem, and explained many scenarios in which it applies. See his paper for the details.

Frédéric Dupuis , Classical coding via decoupling

Decoupling is great for sending quantum messages, and gives simpler proofs than even the original HSW proof of the classical capacity (or any other known proofs). Thus it’s interesting to find a decoupling-based proof of the HSW theorem not only for the sake of the unification of knowledge, but also so we can get a simpler proof. This is essentially what is achieved here, although only when the inputs are uniformly distributed.

Mark Wilde, Polar codes for classical, private, and quantum communication

We are really good at dealing with correlated information, with tools like Slepian-Wolf, side-information-assisted hypothesis testing and multiple-access channel codes. So we can treat inputs to channels in this way. We can perform simple $\mathbb{F}_2$-linear maps on the inputs so that $n$ binary channels, each with capacity $C$, become roughly $nC$ channels with capacity roughly 1 and $n(1-C)$ channels with capacity roughly 0.
Quantumly, much of this goes through, but we need the ability to simultaneously apply typical projections. This key lemma is Lemma 3 in Pranab Sen’s paper arXiv:1109.0802.

$1 - \mathrm{tr}\, \Pi_N \cdots \Pi_1\, \rho\, \Pi_1 \cdots \Pi_N \leq 2 \sqrt{\sum_{i=1}^N \mathrm{tr}\,(I-\Pi_i)\rho}.$

Mark calls this a “noncommutative union bound.” Note that using a Winter-style “gentle measurement lemma” puts the square root inside the sum, which for this application cuts the achievable rate in half.
For private communication, we can polarize into channels of four types: either good for Bob and good for Eve, bad for both, or good for one and bad for the other. We send random bits into the channels that are good for both, and arbitrary bits into the ones that are bad for both. Information bits go into the ones that are good for Bob and bad for Eve, and shared secret key goes into the ones that are bad for Bob (so he effectively gets the message anyway), and good for Eve.
This generalizes to quantum communication using Devetak’s technique from quant-ph/0304127. The channel inputs are:

  • Good for Bob, good for Eve: $|+\rangle$
  • Good for Bob, bad for Eve: information
  • Bad for Bob, good for Eve: shared entanglement
  • Bad for Bob, bad for Eve: arbitrary

A big open question is to make the decoding efficient, and to figure out which channels are good.

Joseph M. Renes, Quantum information processing as classical processing of complementary information

The main theme is a CSS-like one: we can often treat quantum information as being classical information about two complementary observables.
For example, if you coherently measure in both the X and Z basis, then you’ve effectively done a swap with the qubit you’re measuring.
This principle is related to uncertainty principles
$H(X^A|B) + H(Z^A|C) \geq \log 1/c$
$H(X^A|B) + H(Z^A|B) \geq \log 1/c + H(A|B)$
Here $c$ is the largest overlap between eigenvectors of the X and Z operators.
Rearranging the second inequality (with $c = 1/d$ for complementary bases), we see
$H(A|B) \leq H(X^A|B) + H(Z^A|B) - \log d.$
Thus, entanglement between A and B corresponds to the ability to predict complementary observables.
Many quantum information protocols can be understood in this way; e.g. entanglement distillation can be thought of as Alice and Bob having some correlated amplitude and phase information, and trying to do information reconciliation on these quantities.
Joe also talked about quantum polar codes, which create “synthetic” channels with suitable uses of CNOTs on the inputs. The idea is that CNOTs act on Z information in one direction and X information in the opposite direction. And a decoder need only separately figure out amplitude and phase information. There are subtleties: this information can be correlated, and it can exist on the same qubits. When amplitude and phase information is found on the same qubit, we use entanglement assistance.
This gives efficient decoding for Pauli channels and erasure channels. And in numerical experiments, the entanglement assistance appears to be often unnecessary.

Codes, Geometry and Random Structures: Day 1

I now appreciate the difficulty of taking notes in real time! Here is my “liveblogging” of the first day.

Codes, Geometry and Random Structures

The first talk is by Fernando Brandao, who’s talking about our joint paper (also with Michal Horodecki) titled Random quantum circuits are approximate polynomial-designs. “Random quantum circuits” means choosing poly(n) random two-qudit gates between nearest neighbors of a line of n qudits. (The hardest case is qubits, since increasing the local dimension increases the randomizing power of the local gates.) An (approximate) t-design is a distribution over $U(d^n)$ that has (approximately) the same first t moments as the Haar measure on $U(d^n)$. (Technically, by tth moment we mean polynomials of degree t in the entries of U and degree t in $U^*$.)
Exact t-designs are finicky combinatorial objects, and we only know how to construct them efficiently when t is 1 or 2 (Paulis are 1-designs and Cliffords are 2-designs). But for a long time, the only approximate t-designs we could construct were also only for t=1 or 2, and the only progress was to reduce the polynomial cost of these designs, or to connect them with plausible natural models of random circuits. In the last few years, the three of us (together with Richard Low) found a construction of efficient t-designs on n qubits for $t \leq O(n/\log n)$, and found that polynomial-size random circuits give approximate 3-designs.
So how do we get t up to poly(n) in our current paper? There are four technical ingredients.

  1. As with classical random walks, it’s useful to think about quantum random circuits in terms of the spectral gaps of certain Hermitian matrices. The matrices we consider have dimension $d^{2nt}$, and we hope to show that their spectral gap is at least $1/\mathrm{poly}(n,t)$. For more on this, see my earlier work (with Matt Hastings) on tensor product expanders, or the Hamiltonian-based formalism of Znidaric and Brown-Viola
  2. Using a version of path coupling for the unitary group due to Oliveira, we can show that random circuits of exponential length (i.e. $\mathrm{poly}(d^n)$ gates) are t-designs for all t. In other words, the resulting distribution over the unitary group is approximately uniform in whatever natural distance measure you like (for us, we use Wasserstein (earthmover) distance). This is what we intuitively expect, since constructing an arbitrary unitary requires $\mathrm{poly}(d^n)$ gates, so one might guess that applying a similar number of random gates would give something approximately uniform.
  3. This means that random circuits on O(log n) qudits are rapidly mixing, which translates into a statement about the gaps of some corresponding Hamiltonians. We would like to extend this to a statement about the gaps for n qudits. This can be achieved by a theorem of Nachtergaele.
  4. For this theorem to apply, we need certain projectors to approximately commute. This involves a technical calculation of which the key idea is that the t! permutations of t D-dimensional systems are approximately orthogonal (according to the Hilbert-Schmidt inner product) when $t \ll \sqrt{D}$. Here t comes from the number of moments we are trying to control (i.e. we want a t-design) and D is the dimension of the smaller block that we know we have good convergence on. In this case, the block has $O(\log n)$ qudits, so $D = \mathrm{poly}(n)$. If we choose the constant in the $O(\log n)$ right, then D will dominate t and the overall circuit will be a t-design.
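For concreteness, here is a minimal sketch (mine, for qubits, $d=2$) of the random-circuit model itself: repeatedly pick a random nearest-neighbor pair on a line of n qubits and apply a Haar-random two-qubit gate to the state vector. The helpers are my own, not code from the paper.

```python
import numpy as np

def haar_unitary(dim, rng):
    # Haar-random unitary via QR of a complex Gaussian matrix, with phases fixed.
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def apply_two_qubit_gate(state, gate, i, n):
    # Apply a 4x4 gate to qubits (i, i+1) of an n-qubit state vector.
    psi = state.reshape([2] * n)
    psi = np.moveaxis(psi, [i, i + 1], [0, 1]).reshape(4, -1)
    psi = (gate @ psi).reshape([2, 2] + [2] * (n - 2))
    return np.moveaxis(psi, [0, 1], [i, i + 1]).reshape(-1)

rng = np.random.default_rng(0)
n, num_gates = 8, 200
state = np.zeros(2 ** n, dtype=complex)
state[0] = 1.0                                   # start in |00...0>
for _ in range(num_gates):
    i = rng.integers(n - 1)                      # random nearest-neighbour pair (i, i+1)
    state = apply_two_qubit_gate(state, haar_unitary(4, rng), i, n)
print("norm preserved:", np.isclose(np.linalg.norm(state), 1.0))
```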

Whenever I talk about t-designs, I get a lot of skeptical questions about applications. One that I think is natural is that quantum circuits of size n^k given access to a black-box unitary U can’t tell whether U was drawn from the Haar measure on the full unitary group, or from an n^{O(k)}-design. (This is described in our paper.) The proof of this is based on another general application, which is that t-designs give concentration of measure, similar to what you get from uniformly random unitaries, but with the probability of a quantity being far from its expectation decreasing only as D^{-t}, where D is the dimension of the system.
Next, Ashley Montanaro spoke about


A quantum generalisation of Fourier analysis on the boolean cube

based mostly on his nice paper on this topic with Tobias Osborne.
In theoretical computer science, Fourier analysis of boolean functions is a powerful tool. One good place to learn about this is these lecture notes of Ryan O’Donnell. There are also good surveys by Punya Biswal and Ronald De Wolf. The main idea is that a function f from {-1,1}^n to mathbb{R} can be expanded in the Fourier basis for mathbb{Z}_2^n. This is equivalent to expressing f as a multilinear form, or in quantum language, we might think of f as a 2^n-dimensional vector and apply H^{otimes n}. If f is a multilinear function, then it has a degree, given by the size of the largest monomial with a nonzero coefficient. Alternatively, we can ask for the maximum Hamming weight of any state that has nonzero amplitude after we apply H^{otimes n}.
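To make this concrete, here is a tiny numpy sketch (mine, using only the definitions above): it computes the Fourier coefficients of a small boolean function by applying H^{otimes n} to its table of values, and reads off the degree as the largest Hamming weight carrying a nonzero coefficient.

```python
# Fourier expansion of a boolean function via the Hadamard transform.
import itertools
import numpy as np

n = 3
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
Hn = H
for _ in range(n - 1):
    Hn = np.kron(Hn, H)

# f = majority on 3 bits, as a vector over {-1,1}^n (first coordinate varies slowest).
points = list(itertools.product([1, -1], repeat=n))
f = np.array([np.sign(sum(x)) for x in points], dtype=float)

# hat{f}(S) = 2^{-n} sum_x f(x) prod_{i in S} x_i, indexed by the bits of the row index.
coeffs = Hn @ f / np.sqrt(2 ** n)
weights = [bin(s).count("1") for s in range(2 ** n)]
degree = max(w for w, c in zip(weights, coeffs) if abs(c) > 1e-12)
print("Fourier coefficients:", np.round(coeffs, 3))
print("degree of majority on 3 bits:", degree)   # 3, from the x1*x2*x3 term
```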
Here is a small sample of the nice things we know classically.

  • KKL Theorem: Any boolean function f has some j for which I_j(f) = Omega({rm Var}(f)log(n)/n).
    (Spoiler alert: no quantum analogue is known, but proving one is a great open problem.)
  • Hypercontractive bounds: Define a noise operator D_rho that flips each bit with probability frac{1-rho}{2}. Define the p-norm
    |f|_p = left( frac{1}{2^n} sum_{x in {-1,1}^n} |f(x)|^p right)^{1/p}
    Then the hypercontractive inequality states that
    |D_rho(f)|_q leq |f|_p
    if 1 leq p leq q and rho leq sqrt{frac{p-1}{q-1}}
  • One application is that degree-d functions satisfy
    |f|_q leq (q-1)^{d/2} |f|_2.

What about quantum versions?
Boolean functions are replaced by Hermitian matrices of dimension 2^n. The Fourier expansion is replaced by a Pauli expansion. The noise operator is replaced by a depolarizing channel acting on every qubit.
With these replacements, a hypercontractive inequality can still be proven, albeit with the restriction that p leq 2 leq q. The classical argument does not entirely go through, since at one point it assumes that p rightarrow q norms are multiplicative, which is true for matrices but not superoperators. Instead they use results by King that appear to be specialized to qubits.
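Here is a small numpy sketch of the quantum dictionary (my own illustration, with a random test matrix): expand a Hermitian matrix in the Pauli basis, read off its degree, and apply the qubit-wise depolarizing channel D_rho(M) = rho M + (1-rho) {rm tr}(M) I/2, which in the Fourier picture simply damps a weight-w Pauli term by rho^w.

```python
# Pauli expansion of a Hermitian matrix, its degree, and depolarizing noise.
import itertools
import numpy as np

n, rho = 2, 0.5
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli(idx):
    P = np.array([[1.0 + 0j]])
    for i in idx:
        P = np.kron(P, [I2, X, Y, Z][i])
    return P

# A random Hermitian test matrix H of dimension 2^n (illustrative only).
A = np.random.default_rng(0).normal(size=(2 ** n, 2 ** n))
H = (A + A.T) / 2

# Pauli expansion H = sum_s hat{H}(s) P_s, with hat{H}(s) = tr(P_s H) / 2^n.
coeffs = {idx: np.trace(pauli(idx).conj().T @ H).real / 2 ** n
          for idx in itertools.product(range(4), repeat=n)}
degree = max(sum(i != 0 for i in idx) for idx, c in coeffs.items() if abs(c) > 1e-12)

# Depolarizing noise in the Fourier picture: weight-w terms are damped by rho^w.
H_noisy = sum(c * rho ** sum(i != 0 for i in idx) * pauli(idx) for idx, c in coeffs.items())
print("degree of H:", degree)
print("normalized 2-norm before and after noise:",
      np.linalg.norm(H) / 2 ** (n / 2), np.linalg.norm(H_noisy) / 2 ** (n / 2))
```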
As a result of the hypercontractive inequality, there are some nice applications to k-local Hamiltonians. For example, if |H|_2=1 and t geq (2e)^{k/2}, then the fraction of eigenvalues of H greater than t is less than e^{-kt^{2/k}/2e}. And if H is nonzero then it has rank geq 2^n e^{-2k}.
The FKN theorem also carries over, implying that if the weight of Fourier coefficients on (2+)-local terms is no more than epsilon then the operator must be O(epsilon)-close to a single-qubit operator in 2-norm distance.
There are a number of great open questions raised in this work. Ashley didn’t mention this, but one of those open questions led to our joint work arXiv:1001.0017, which has been a lot of fun to work on. A quantum KKL theorem is another. And here is a third: suppose H^2=I, and that H acts nontrivially on each qubit. Does it hold that the degree of H is always at least Omega(log n)?
<Interlude>
Live-blogging is hard work! Even semi-live blogging is.
You’ll notice that the level of detail of my reports diminishes over time.
The speakers were all excellent; it’s only your reporter that started to fade.
</Interlude>

Marius Junge, Exponential rates via Banach space tools

The key quantum information theory question addressed today is:
Why is C_E = 2Q_E? And no, the answer is not “super-dense coding and teleportation.” (At least not in this talk.) Note that the classic quantum papers on entanglement-assisted channel coding are quant-ph/0106052 and quant-ph/0106075.
First, Marius gave a nice review of classical information theory. In the spirit of Shannon, I will not repeat it here.
Then he reinterpreted it all in operator algebra language! For example, classical capacity can be interpreted as a commutative diagram!
begin{array}{ccc} L_1(A^{otimes n}) & overset{{{cal N}^{otimes n}}}{longrightarrow} & L_1(B^{otimes n}) \ uparrow {cal E} & &downarrow {cal D} \ ell_1^N & overset {approx {rm Id}}{longrightarrow} & ell_1^N end{array}
For the quantum capacity, we simply replace ell_1^N with L_1(M_d).
To mix quantum and classical, we can define C = M_{n_1} oplus cdots oplus M_{n_k} in the spirit of quant-ph/0203105.
The resulting commuting diagram is:
begin{array}{ccc}L_1(A^{otimes n}) &overset{{{cal N}^{otimes n}}}{longrightarrow}& L_1(B^{otimes n}) \ uparrow ({rm id} otimes {cal E}) & &downarrow {cal D} \ L_1(C) & overset {approx {rm Id}}{longrightarrow} & L_1(C) end{array}
There is also an entanglement-assisted version that I won’t write down, but hopefully you can imagine.
Next, he introduced p-summing operators. The underlying principle is the following.

In a finite-dimensional Banach space, every unconditionally convergent series (i.e. one that converges under any permutation of its terms) is absolutely convergent. But in general, this is not the case.

More formally,
T:X rightarrow Y is p-summing if
left ( sum_k |T x_k|_Y^p right)^{frac{1}{p}} leq pi_p(T)sup_{|(alpha_k)|_{p'}leq 1} left| sum_k alpha_k x_k right|
where 1/p + 1/p' = 1.
pi_p(T) is the smallest constant for which this holds.
e.g. if p=1,
sum_k |T x_k| leq pi_1(T) sup_{epsilon_k = pm 1} |sum_k epsilon_k x_k|.
Why is this nice?
One use is the following factorization theorem due to Grothendieck and Pietsch.

If T: ell_infty^m rightarrow X is absolutely p-summing, then there exists a probability distribution lambda such that
|T(x)|_X leq pi_p(T) left(sum_{i=1}^m lambda_i |x_i|^p right)^{1/p}

Here is the connection to (classical) information theory. A noisy channel can be written as
Phi: ell_1^m rightarrow ell_1^m. The dual map is Phi^*: ell_infty^m rightarrow ell_infty^m. And the capacity is given by this amazing formula!

C(Phi) = lim_{prightarrowinfty} p ln pi_p(Phi^*)

The following innocuous observation will have powerful consequences: pi_p(T^{otimes n}) = pi_p(T)^n.
One consequence is a strong converse: let alpha > C(Phi) and N geq e^{alpha n}.
Then P_{rm succ} leq e^{-epsilon n} for some epsilon > 0 depending on alpha - C(Phi).
To prove this we also need the fact that pi_1(S: ell_infty^N rightarrow X) leq N^{1-1/p} pi_p(S), which implies that the success probability is
leq pi_p(Phi^*)^n / N^{1/p}. Since pi_p(Phi^*) approx e^{C(Phi)/p} for large p (by the capacity formula above) and N geq e^{alpha n}, this is roughly e^{-(alpha - C(Phi)) n / p}.
Next, Marius talked about how their formalism applies to the
entanglement-assisted capacity of quantum channels. Again there is a simple formula

C_E({cal N}) = lim_{prightarrow infty} pi_p^o ({cal N}^*)

What is pi_p^o?
Let {cal N}^* map from M_m to M_k. Then pi_p^o(T) = inf |a|_p |||S|||, where the inf is over all a, S satisfying T(x) = S(a^* x a).
There is also an expression for limited entanglement assistance, which was considered operationally in quant-ph/0402129.
Ultimately, there is an answer for the question at the start of the talk. The classical capacity is twice as big because dim M_d = d^2 and dim ell_1^d = d. Obviously! 🙂
There is also the promise of an intriguing new additivity violation in the limited-entanglement setting, although I admit that the exact details eluded me.

Yi-Kai Liu,
Universal low-rank matrix recovery from Pauli measurements, based on 1103.2816

Previous work on compressed tomography established that
for any rank-r density matrix rho, O(rd log^2 d) random Paulis suffice to reconstruct rho with high probability.
This work establishes that O(r d log^6(d)) random Paulis work simultaneously for all rho. This also gives better error bounds in the noisy setting.
As a result, one can obtain bounds on the sample complexity, instead of just the number of measurement settings.
The technical statement that we need is that random Paulis satisfy the following restricted isometry property (RIP):

For all X with rank leq r,
(1-delta) |X|_2 leq |R(X)|_{ell_2} leq (1+delta) |X|_2

Gaussian matrices are known to have this property [Recht, Fazel, Parrilo 2007].
More concretely, let Omega be a random set of m = O(r d log^6 d) random Pauli matrices.
Define R(rho) = ({rm tr} P rho)_{P in Omega}.
Measurements produce b approx R(rho).
The reconstruction algorithm is to solve:

{rm argmin}_{Xgeq 0} |R(X)-b|_2^2 + mu {rm tr} |X|
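Here is a minimal numpy sketch of this convex program on synthetic data (my own illustration, not code from the paper). I solve it by proximal gradient descent, where the proximal step for mu tr X over X geq 0 is an eigenvalue soft-threshold; the sizes, the regularization mu, the step size, and the iteration count are all illustrative assumptions.

```python
# Trace-norm-regularized least squares from random Pauli expectation values,
# solved by proximal gradient descent (ISTA) with an eigenvalue soft-threshold.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 3, 1, 40              # qubits, rank of the true state, # Pauli settings
d = 2 ** n

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_string(idx):
    P = np.array([[1.0 + 0j]])
    for i in idx:
        P = np.kron(P, [I2, X, Y, Z][i])
    return P

# m distinct random Pauli settings (no repeats keeps R*R well conditioned).
all_idx = list(itertools.product(range(4), repeat=n))
Omega = [pauli_string(all_idx[j]) for j in rng.choice(len(all_idx), m, replace=False)]

# Random rank-r "true" state rho and noiseless data b = R(rho).
G = rng.normal(size=(d, r)) + 1j * rng.normal(size=(d, r))
rho = G @ G.conj().T
rho /= np.trace(rho).real
b = np.array([np.trace(P @ rho).real for P in Omega])

def R(M):                        # sampling map R(X) = (tr P X) for P in Omega
    return np.array([np.trace(P @ M).real for P in Omega])

def psd_shrink(M, tau):          # prox of tau*tr(X) over X >= 0, for Hermitian M
    M = (M + M.conj().T) / 2
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w - tau, 0.0)) @ V.conj().T

mu, step = 0.02, 1.0 / (2 * d)   # distinct Paulis give |R*R| = d, so step = 1/Lipschitz
Xhat = np.zeros((d, d), dtype=complex)
for _ in range(500):
    grad = sum(2 * c * P for c, P in zip(R(Xhat) - b, Omega))
    Xhat = psd_shrink(Xhat - step * grad, step * mu)

print("Frobenius error of the reconstruction:", np.linalg.norm(Xhat - rho))
```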

Why does this work?
The set of X with {rm tr}|X| leq 1 is the trace-norm ball, and low-rank states lie on its exposed, low-dimensional faces.
So when we intersect with a generic affine subspace {X : R(X)=b}, we’re likely to get a unique solution.
More formally, let rho be the true state and S = X-rho. Note that R(S)=0. We want to show S=0. Decompose S = S_0 + S_c, where S_0 has rank leq 2r and S_c has no overlap with row or column spaces of rho.
If X has minimum trace, then |S_c|_1 leq |S_0|_1.
Then we can use RIP to show that S_0 and S_c are both small, using a clever telescoping technique due originally to Candes and Recht.
Ok, so how do we prove the RIP? The idea is that R should be well conditioned, and be “incoherent” so that its operator norm is much less than its 2-norm.
Recht et al ’07 used a union bound over (a net for) rank-r matrices. This works because Gaussians have great concentration. But Paulis are pricklier.
This work: Use generic chaining (a la Rudelson and Vershynin). This requires proving bounds on covering numbers, which will be done using entropy duality (c.f. Guedon et al 2008).
Here’s a little more detail. If T is a self-adjoint linear map from M_d to M_d, then
define |T|_{(r)} = sup {|tr X^dag T(X)| : X in U}, where
U = {X in M_d : |X|_2leq 1, {rm rank}(X) leq r}
The goal is to show |R*R-I|_{(r)} leq 2 delta - delta^2, where delta comes from the RIP condition.
The main tool is Dudley’s inequality:
mathbb{E}[sup G(X) : X in U] leq {rm const} int {rm d}epsilon sqrt{log(N(U,d_G, epsilon))}
Here G is a Gaussian process with d_G(X,Y) = sqrt{mathbb{E}((G(X)-G(Y))^2)} and N(U,d_G,epsilon) is the # of radius-epsilon balls in metric d_G needed to cover U.
We can upper bound N using the trace norm. Let B_1 denote the trace-norm ball.
Define |M|_X = max_{P in Omega} |tr P^dag M|.
There are two estimates of N. The easy one is that
N(B_1, |cdot|_X, epsilon) leq poly(1/epsilon) exp(d^2)
The harder one is that
N(B_1, |cdot|_X, epsilon) leq exp(log^2(d) / epsilon^2).
This is obtained using entropy duality, with arguments that are somewhat specific to the spaces in question, using techniques of Maurey. See paper (and references 🙂 ) for details.

Matthias Christandl
Reliable quantum state tomography, based on 1108.5329

The central question:

How do you reconstruct a density matrix, with error bars, from measurements?

One framework is “measure and predict.”
We have n+k systems: we measure the first n (like a training set) and then predict the outcomes of measurements on the remaining k (like a testing set).
The method:
First compute the likelihood function
mu_{B^n}(sigma) = frac{1}{c_{B^n}} {rm tr}[sigma^{otimes n}B^n]
From this we can compute a confidence region, point estimates, error bars, and what have you.
How to compute a 1-epsilon confidence region?
Let epsilon' = epsilon/poly(n), and let Gamma have measure 1-epsilon'. Then add an extra distance delta = sqrt{(2/n) (ln 2/epsilon + O(ln n))} to obtain Gamma^delta.
Then the main result is that the state is in Gamma^delta with probability geq 1-epsilon. Crucially, the probability is taken over measurement outcomes, and it is important to realize that the reported confidence region is itself a random variable. So one cannot say that, conditioned on the measurement outcome, we have a high probability of the state being in the confidence region. Without a prior distribution over states, such statements are impossible.
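For intuition, here is a single-qubit cartoon of the recipe (my own toy, not the paper's construction): only sigma_z measurements, epsilon' taken equal to epsilon, and the O(ln n) term in delta dropped, so none of the constants should be taken seriously.

```python
# Toy single-qubit version of "likelihood region, then enlarge by delta".
import numpy as np

rng = np.random.default_rng(1)
n, eps = 2000, 0.01
true_p = 0.8                                 # <0|rho|0> for the "true" state
k = rng.binomial(n, true_p)                  # number of observed "0" outcomes

# Grid over the Bloch ball; for qubits the Hilbert-Schmidt measure is uniform on the ball.
pts = rng.uniform(-1, 1, size=(200000, 3))
pts = pts[(pts ** 2).sum(axis=1) <= 1]
p0 = (1 + pts[:, 2]) / 2                     # <0|sigma|0> for each candidate state

# Likelihood of the data for each candidate sigma, normalized into weights.
logL = k * np.log(p0 + 1e-12) + (n - k) * np.log(1 - p0 + 1e-12)
w = np.exp(logL - logL.max())
w /= w.sum()

# Gamma: highest-likelihood states capturing 1-eps of the weight (eps' ~ eps here).
order = np.argsort(-w)
inside = order[np.cumsum(w[order]) <= 1 - eps]
# Enlargement delta, dropping the O(ln n) correction from the talk.
delta = np.sqrt((2 / n) * np.log(2 / eps))
print(f"{len(inside)} of {len(pts)} grid states in Gamma; enlarge by delta = {delta:.3f}")
```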
A key lemma in the proof (due to 0806.1091 and applied to QKD in 0809.3019) is that for any sigma, sigma^{otimes n}leq n^{O(d^2)} int {rm d}rho, rho^{otimes n}, where {rm d}rho is the Hilbert-Schmidt measure.

John Preskill, Protected gates for superconducting qubits

This talk was an award lecture for the award that goes by the charmingly Frenglish name of “chaire Aisenstadt chair”. John was introduced by a poem by Patrick!
I enjoyed John’s two opening questions that he imagined the audience would have after seeing the title: “What does this have to do with this conference?” and “Will I understand anything?”
His response was that we shouldn’t worry, since this is a theorist’s conception of superconducting qubits.
Unfortunately, my note-taking quality suffered during this talk, since there was a high density of equations, figures and ideas. So my summary will be breezier. This may become the norm for the remaining days of the conference as well. However, here is an older version of this talk.
As we all know, it’d be great to have a quantum computer. Instead of concatenated FTQC, which has lousy constants, what about physically motivated QECCs? One example is the braiding of nonabelian anyons. Here’s another, less studied one: encoding qubits in harmonic oscillators [“Continuous variable quantum codes”, Gottesman-Kitaev-Preskill 2000].
The rest of the talk was about a particular variant of this idea, called the 0-pi qubit (due to Kitaev).
I have more detailed notes, but they are really rough, and I am saving my energy for the upcoming sessions. In other words, the margin is big enough for the proof, but my glucose level is not.

Why medicine needs scirate.com

Defenders of the traditional publishing model for medicine say that health-related claims need to be vetted by a referee process. But there are heavy costs. In quantum information, one might know the proof of a theorem (e.g. the Quantum Reverse Shannon Theorem) for years without publishing it. But one would rarely publish using data that is itself secret. Unfortunately, this is the norm in public health. It’s ironic that the solution to the 100-year-old Poincaré conjecture was posted on arxiv.org and rapidly verified, while research on fast-moving epidemics like H5N1 (bird flu) is
delayed so that scientists who control grants can establish priority.
All this is old news. But what I hadn’t realized is that the rest of science needs not only arxiv.org, but also scirate.com. Here is a recent and amazing, but disturbingly common, example of scientific fraud. A series of papers were published with seemingly impressive results, huge and expensive clinical trials were planned based on these papers, while other researchers were privately having trouble replicating the results, or even making sense of the plots. But when they raised their concerns, here’s what happened (emphasis added):

In light of all this, the NCI expressed its concern about what was going on to Duke University’s administrators. In October 2009, officials from the university arranged for an external review of the work of Dr Potti and Dr Nevins, and temporarily halted the three trials. The review committee, however, had access only to material supplied by the researchers themselves, and was not presented with either the NCI’s exact concerns or the problems discovered by the team at the Anderson centre. The committee found no problems, and the three trials began enrolling patients again in February 2010.

As with the Schön affair, there were almost comically many lies, including a fake “Rhodes scholarship in Australia” (which you haven’t heard of because it doesn’t exist) on one researcher’s CV. But what if they had lied only slightly more cautiously?
By contrast, with scirate.com, refutations of mistaken papers can be quickly crowdsourced. If you know non-quantum scientists, go forth and spread the open-science gospel!