I now appreciate the difficulty of taking notes in real time! Here is my “liveblogging” of the first day.
The first talk is by Fernando Brandao, who’s talking about our joint paper (also with Michal Horodecki) titled “Random quantum circuits are approximate polynomial-designs”. “Random quantum circuits” means choosing poly(n) random two-qudit gates between nearest neighbors of a line of n qudits. (The hardest case is qubits, since increasing the local dimension increases the randomizing power of the local gates.) An (approximate) t-design is a distribution over U(d^n) that has (approximately) the same first t moments as the Haar measure on U(d^n). (Technically, by t-th moment we mean polynomials of degree t in the entries of U and degree t in the entries of U*.)
Exact t-designs are finicky combinatorial objects, and we only know how to construct them efficiently when t is 1 or 2 (Paulis are 1-designs and Cliffords are 2-designs). For a long time, the only approximate t-designs we could construct were also only for t=1 or 2, and the only progress was to reduce the polynomial cost of these designs, or to connect them with plausible natural models of random circuits. In the last few years, the three of us (together with Richard Low) found a construction of efficient t-designs on n qubits for , and found that polynomial-size random circuits give approximate 3-designs.
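To make the "Paulis are 1-designs" remark concrete, here is a minimal sketch (my own illustration, not something from the talk) for a single qubit. The first moment of the Haar measure sends any ρ to tr(ρ)·I/2, and averaging P ρ P† over the four Paulis should do the same. The helper names `matmul`, `dagger` and `pauli_twirl` are made up for this example.

```python
# Single-qubit Paulis as 2x2 nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def pauli_twirl(rho):
    """Uniform average of P rho P^dagger over the four Paulis."""
    out = [[0j, 0j], [0j, 0j]]
    for p in PAULIS:
        term = matmul(matmul(p, rho), dagger(p))
        for i in range(2):
            for j in range(2):
                out[i][j] += term[i][j] / len(PAULIS)
    return out


# An arbitrary trace-one Hermitian test matrix (hypothetical input).
rho = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]
twirled = pauli_twirl(rho)
```

For this input, `twirled` comes out as I/2 up to floating-point error, which is exactly what a Haar-random unitary would give on average. This only checks the first moment; it says nothing about t ≥ 2.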
So how do we get t up to poly(n) in our current paper? There are four technical ingredients.
- As with classical random walks, it’s useful to think about quantum random circuits in terms of the spectral gaps of certain Hermitian matrices. The matrices we consider have dimension d^{2nt}, and we hope to show that their spectral gap is at least 1/poly(n,t). For more on this, see my earlier work (with Matt Hastings) on tensor product expanders, or the Hamiltonian-based formalism of Znidaric and Brown-Viola.
- Using a version of path coupling for the unitary group due to Oliveira, we can show that random circuits of exponential length (i.e. poly(d^n) gates) are t-designs for all t. In other words, the resulting distribution over the unitary group is approximately uniform in whatever natural distance measure you like (for us, we use Wasserstein (earthmover) distance). This is what we intuitively expect, since constructing an arbitrary unitary requires poly(d^n) gates, so one might guess that applying a similar number of random gates would give something approximately uniform.
- This means that random circuits on O(log n) qudits are rapidly mixing, which translates into a statement about the gaps of some corresponding Hamiltonians. We would like to extend this to a statement about the gaps for n qudits. This can be achieved by a theorem of Nachtergaele.
- For this theorem to apply, we need certain projectors to approximately commute. This involves a technical calculation, of which the key idea is that the t! permutations of t D-dimensional systems are approximately orthogonal (according to the Hilbert-Schmidt inner product) when D is sufficiently large compared to t. Here t comes from the number of moments we are trying to control (i.e. we want a t-design) and D is the dimension of the smaller block that we know we have good convergence on. In this case, the block has O(log n) qudits, so D = poly(n). If we choose the constant in the O(log n) right, then D will dominate t and the overall circuit will be a t-design.
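The approximate orthogonality of permutations in the last ingredient can be checked by brute force for small t. This is a sketch under my own conventions, not the calculation from the paper: it uses the standard fact that the permutation operator P_π on t copies of a D-dimensional space has trace D^(number of cycles of π), so the normalized Hilbert-Schmidt overlap of P_π and P_σ is D^(c − t), where c is the number of cycles of π⁻¹σ. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple of images."""
    seen = set()
    cycles = 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def offdiag_row_sum(t, D):
    """Sum of normalized overlaps of one permutation operator with all others.

    As sigma ranges over S_t, pi^{-1} sigma also ranges over S_t, so every
    row of the Gram matrix has the same sum and we can take pi = identity.
    """
    total = Fraction(0)
    for perm in permutations(range(t)):
        c = cycle_count(perm)
        if c < t:  # skip the identity, which is the diagonal entry
            total += Fraction(1, D ** (t - c))
    return total


def product_formula(t, D):
    """Closed form for the same sum: prod_{k<t} (1 + k/D) - 1."""
    prod = Fraction(1)
    for k in range(t):
        prod *= 1 + Fraction(k, D)
    return prod - 1
```

The closed form is roughly t²/(2D) when that quantity is small, so for example `offdiag_row_sum(3, 100)` is about 0.03 while `offdiag_row_sum(3, 2)` is 2: the permutations only look orthogonal once D is comfortably larger than t².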
Whenever I talk about t-designs, I get a lot of skeptical questions about applications. One that I think is natural is that quantum circuits of size n^k given access to a black box unitary U can’t tell whether U was drawn from the Haar measure on the full unitary group, or from an n^{O(k)}-design. (This is described in our paper.) The proof of this is based on another general application, which is that t-designs give concentration of measure, similar to what you get from uniformly random unitaries, but with the probability of a quantity being far from its expectation decreasing only as , where D is the dimension of the system.
Next, Ashley Montanaro spoke about
based mostly on his nice paper on this topic with Tobias Osborne.
In theoretical computer science, Fourier analysis of boolean functions is a powerful tool. One good place to learn about this is these lecture notes of Ryan O’Donnell. There are also good surveys by Punya Biswal and Ronald de Wolf. The main idea is that a function f from to can be expanded in the Fourier basis for . This is equivalent to expressing f as a multilinear form, or in quantum language, we might think of f as a 2^n-dimensional vector and apply . If f is a multilinear function, then it has a degree, given by the size of the largest monomial with a nonzero coefficient. Alternatively, we can ask for the maximum Hamming weight of any state that has nonzero amplitude after we apply .
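Here is a small sketch of the Fourier expansion and the degree, under assumptions of my own choosing: f is given as a list of 2^n real values indexed by bit strings, characters are (−1)^{S·x}, and coefficients are normalized by 2^{−n}. The transform is the usual fast Walsh-Hadamard butterfly; `fourier_coefficients` and `degree` are names I made up for illustration.

```python
def fourier_coefficients(values):
    """Walsh-Hadamard transform of a function on n bits, normalized by 2^-n.

    values[x] is f(x), with x read as an n-bit integer. The entry at index S
    of the result is the coefficient of the character (-1)^(S . x).
    """
    n_points = len(values)
    coeffs = list(values)
    h = 1
    while h < n_points:
        for start in range(0, n_points, 2 * h):
            for i in range(start, start + h):
                a, b = coeffs[i], coeffs[i + h]
                coeffs[i], coeffs[i + h] = a + b, a - b
        h *= 2
    return [c / n_points for c in coeffs]


def degree(values, tol=1e-12):
    """Largest Hamming weight |S| carrying a nonzero Fourier coefficient."""
    coeffs = fourier_coefficients(values)
    return max((bin(s).count("1") for s, c in enumerate(coeffs) if abs(c) > tol),
               default=0)


# Example: f(x) = (-1)^(x_0 XOR x_1) on n = 3 bits (the third bit is ignored).
xor_values = [(-1) ** ((x & 1) ^ ((x >> 1) & 1)) for x in range(8)]
xor_coeffs = fourier_coefficients(xor_values)
xor_degree = degree(xor_values)
```

In this example all the Fourier weight sits on the single set S = {0, 1} (index 3), so the degree is 2 even though the function is nominally defined on three bits.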
Here is a small sample of the nice things we know classically.
- KKL Theorem: Any boolean function f has some j for which .
(Spoiler alert: no quantum analogue is known, but proving one is a great open problem.)
- Hypercontractive bounds: Define a noise operator that flips each bit with probability . Define the p-norm
Then the hypercontractive inequality states that
- One application is that degree-d functions satisfy
What about quantum versions?
Boolean functions are replaced by Hermitian matrices of dimension . The Fourier expansion is replaced by a Pauli expansion. The noise operator is replaced by a depolarizing channel acting on every qubit.
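The Pauli expansion can be sketched the same way. This is my own toy version, assuming an n-qubit Hermitian matrix stored as a 2^n × 2^n nested list and coefficients defined as tr(P H)/2^n; the "degree" is then the largest number of non-identity factors in any Pauli string with a nonzero coefficient. All function names are hypothetical, and the brute-force loop over 4^n strings is only meant for tiny n.

```python
from itertools import product

PAULI = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def pauli_string(label):
    """Matrix of a tensor product of Paulis, e.g. 'XZ' means X tensor Z."""
    m = [[1]]
    for ch in label:
        m = kron(m, PAULI[ch])
    return m


def pauli_expansion(h, n, tol=1e-12):
    """Nonzero coefficients tr(P h) / 2^n, keyed by Pauli string label."""
    dim = 2 ** n
    coeffs = {}
    for label in map("".join, product("IXYZ", repeat=n)):
        p = pauli_string(label)
        tr = sum(p[i][j] * h[j][i] for i in range(dim) for j in range(dim))
        c = complex(tr) / dim
        if abs(c) > tol:
            coeffs[label] = c.real  # coefficients are real when h is Hermitian
    return coeffs


def pauli_degree(coeffs):
    """Largest number of non-identity factors among the nonzero terms."""
    return max((sum(ch != "I" for ch in label) for label in coeffs), default=0)


# Example: H = Z tensor Z + 0.5 * (X tensor I) on two qubits.
zz, xi = pauli_string("ZZ"), pauli_string("XI")
H = [[zz[i][j] + 0.5 * xi[i][j] for j in range(4)] for i in range(4)]
H_coeffs = pauli_expansion(H, 2)
H_degree = pauli_degree(H_coeffs)
```

Here `H_coeffs` recovers the two terms that went in, and `H_degree` is 2 because of the ZZ term. In the language of the later paragraphs, this H is a 2-local operator.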
With these replacements, a hypercontractive inequality can still be proven, albeit with the restriction that . The classical argument does not entirely go through, since at one point it assumes that norms are multiplicative, which is true for matrices but not superoperators. Instead they use results by King that appear to be specialized to qubits.
As a result, there are some nice applications to k-local Hamiltonians. For example, if and , then the fraction of eigenvalues of H greater than t is less than . And if H is nonzero then it has rank .
The FKN theorem also carries over, implying that if the weight of Fourier coefficients on (2+)-local terms is no more than then the operator must be -close to a single-qubit operator in 2-norm distance.
There are a number of great open questions raised in this work. Ashley didn’t mention this, but one of those open questions led to our joint work arXiv:1001.0017, which has been a lot of fun to work on. A quantum KKL theorem is another one. Here is another. Suppose , and that H acts nontrivially on each qubit. Then does it hold that the degree of H is always at least ?
Live-blogging is hard work! Even semi-live blogging is.
You’ll notice the level of detail in my reports diminishes over time.
The speakers were all excellent; it’s only your reporter that started to fade.
Marius Junge, Exponential rates via Banach space tools
The key quantum information theory question addressed today is:
Why ? And no, the answer is not “super-dense coding and teleportation.” (At least not in this talk.) Note that the classic quantum papers on entanglement-assisted channel coding are quant-ph/0106052 and quant-ph/0106075.
First, Marius gave a nice review of classical information theory. In the spirit of Shannon, I will not repeat it here.
Then he reinterpreted it all in operator algebra language! For example, classical capacity can be interpreted as a commutative diagram!
For the quantum capacity, we simply replace with .
To mix quantum and classical, we can define in the spirit of quant-ph/0203105.
The resulting commuting diagram is:
There is also an entanglement-assisted version that I won’t write down, but hopefully you can imagine.
Next, he introduced p-summing operators. The underlying principle is the following.
In a finite-dimensional Banach space, every unconditionally convergent sequence (i.e. converges even if arbitrarily permuted) is absolutely summing. But in general, this is not the case.
is p-summing if
is the optimal constant possible in this expression.
e.g. if p=1,
Why is this nice?
One use is the following factorization theorem due to Grothendieck and Pietsch.
If is absolutely p-summing, then there exists a probability distribution such that
Here is the connection to (classical) information theory. A noisy channel can be written as
. The dual map is . And the capacity is given by this amazing formula!
The following innocuous observation will have powerful consequences: .
One consequence is a strong converse: Let .
To prove this we also need the fact that , which implies that the success probability is
Next, Marius talked about how their formalism applies to the
entanglement-assisted capacity of quantum channels. Again there is a simple formula
What is ?
Let map from to . Then , where the inf is over all a, S satisfying .
There is another expression also for limited entanglement assistance, which was considered operationally in quant-ph/0402129.
Ultimately, there is an answer for the question at the start of the talk. The classical capacity is twice as big because and . Obviously! 🙂
There is also the promise of an intriguing new additivity violation in the limited-entanglement setting, although I admit that the exact details eluded me.
Universal low-rank matrix recovery from Pauli measurements, based on 1103.2816
Previous work on compressed tomography established that
for any rank-r density matrix , random Paulis suffice to reconstruct with high probability.
This work establishes that random Paulis work simultaneously for all . This also gives better error bounds for noisy behavior.
As a result, one can obtain bounds on the sample complexity, instead of just the number of measurement settings.
The technical statement that we need is that random Paulis satisfy the following restricted isometry property (RIP):
For all X with rank ,
Gaussian matrices are known to have this property [Recht, Fazel, Parrilo 2007].
More concretely, let be a random set of random Pauli matrices.
The reconstruction algorithm is to solve:
Why does this work?
The set of X with is a ball, and low-rank states lie on its exposed faces.
So when we intersect with some generic hyperplane R(X)=b, we’re likely to have a unique solution.
More formally, let be the true state and . Note that R(S)=0. We want to show S=0. Decompose , where has rank and has no overlap with row or column spaces of .
If X has minimum trace, then .
Then we can use RIP to show that and are both small, using a clever telescoping technique due originally to Candes and Recht.
Ok, so how do we prove the RIP? The idea is that R should be well conditioned, and be “incoherent” so that its operator norm is much less than its 2-norm.
Recht et al ’07 used a union bound over (a net for) rank-r matrices. This works because Gaussians have great concentration. But Paulis are pricklier.
This work: Use generic chaining (à la Rudelson and Vershynin). This requires proving bounds on covering numbers, which will be done using entropy duality (cf. Guedon et al 2008).
Here’s a little more detail. If T is a self-adjoint linear map from to , then
define , where
The goal is to show , where comes from the RIP condition.
The main tool is Dudley’s inequality:
Here G is a Gaussian process with and is the # of radius- balls in metric needed to cover U.
We can upper bound N using the trace norm. Let denote the trace-norm ball.
There are two estimates of N. The easy one is that
The harder one is that
This is obtained using entropy duality, with arguments that are somewhat specific to the spaces in question, using techniques of Maurey. See paper (and references 🙂 ) for details.
The central question:
How do you reconstruct a density matrix, with error bars, from measurements?
One framework is “measure and predict.”
We have n+k systems, and measure n (like a training set) and then predict the results of measurement outcomes from the next k (like a testing set).
First compute the likelihood function
From this we can compute a confidence region, point estimates, error bars, and what have you.
How to compute a confidence region?
Let , and let have measure . Then add an extra distance to obtain
Then the main result is that the state is in with probability . Crucially the probability is taken over measurement outcomes, and it is important to realize that the outputted confidence region is itself a random variable. So one cannot say that conditioned on the measurement outcome, we have a high probability of the state being in the confidence region. Without a prior distribution over states, such statements are impossible.
John Preskill, Protected gates for superconducting qubits
This talk was an award lecture for the award that goes by the charmingly Frenglish name of “chaire Aisenstadt chair”. John was introduced by a poem by Patrick!
I enjoyed John’s two opening questions that he imagined the audience would have after seeing the title: “What does this have to do with this conference?” and “Will I understand anything?”
His response was that we shouldn’t worry, since this is a theorist’s conception of superconducting qubits.
Unfortunately, my note-taking quality suffered during this talk, since there was a high density of equations, figures and ideas. So my summary will be breezier. This may become the norm for the remaining days of the conference as well. However, here is an older version of this talk.
As we all know, it’d be great to have a quantum computer. Instead of concatenated FTQC, which has lousy constants, what about physically motivated QECCs? One example is the braiding of nonabelian anyons. Here’s another, less studied one: encoding qubits in harmonic oscillators [“Continuous variable quantum codes”, Gottesman-Kitaev-Preskill 2000].
The rest of the talk was about a particular variant of this idea, called the 0-pi qubit (due to
I have more detailed notes, but they are really rough, and I am saving my energy for the upcoming sessions. In other words, the margin is big enough for the proof, but my glucose level is not.