Update: As Greg points out the way this thing works is pretty dang silly. For me it gives me 108 with my name (10 if I use Dave, and 98 if I use David). Anecdotally this seems off. Why? Well I grew up in a town of just over 6000 and there were two (2!) people with my name. One of them was even my age and had the same middle initial. In fact that’s how my name got changed to Dave from David. The teachers needed some way to distinguish us and so I got Dave and he got David. So 2/6000 times 300 million = 100000. Mmm…argument by anecdote.
Not as Bad as the Best Expert
Tuesday I went to a cool talk by Sanjeev Arora on the multiplicative weight method and, because I should be working on a different calculation, I thought I’d write up a description of a basic version of this algorithm (Mmm…I love the smell of procrastination in the morning.)
Suppose that there is a stock exchange where each day you can bet on whether a stock will go up or down that day. Of course you want to bet correctly on this stock every day so as to earn the maximum amount of moolah. But you don’t know much about stocks and are a bit lazy. But you do have access to a group of [tex]$n$[/tex] experts. Every day you get to learn these experts’ predictions for that day’s movement of the stock price. Now of course, since these are experts, you’d like to do as well as the best of these experts. So what should you do? It seems that knowing what the experts predict shouldn’t help you perform as well as the best of these experts, right? I mean, how would you know who the best of these experts is?
Well, it turns out that there is a way to guarantee performing almost as well as the best expert in this game. This is the heart of the multiplicative weight algorithm. Begin by assigning a weight to every expert. Since we don’t know anything at the time of our first bet, we might as well assign a weight of one to every expert. Now intuitively what we are going to do is to increase the relative weight of experts when they correctly predict the outcome of a day’s movement. Thus we will update our weights as follows. If an expert predicted correctly, then we do nothing to the weight. If, on the other hand, the expert was wrong, then we multiply the expert’s corresponding weight by [tex]$1-\epsilon$[/tex], where [tex]$0< \epsilon< \frac{1}{2}$[/tex] is a small number which we will discuss later. Okay, so the weights will now begin to favor experts who are right more often. Now how should we exploit this in our own playing of the game? Well, each expert will have a prediction for the movement of the stock. What we do is take a weighted majority of these predictions. That is, if the weight of the experts predicting the stock will go up is more than half the total weight of all experts, then we will bet on up. If the weight of the experts predicting the stock will go down is more than half the total weight of all experts, then we will bet on down. A pretty simple algorithm, no? As we play the game, we change the weights of the experts and bet depending on what the weighted majority of them predict. That’s among the simplest strategies we could imagine…so how well does it work?
Now here is the cool thing. Suppose that we play the game [tex]$t$[/tex] times. Let [tex]$m(t)$[/tex] denote the number of times we, using the above algorithm, guess incorrectly (our number of mistakes). Further, let [tex]$m_i(t)$[/tex] denote the number of times expert [tex]$i$[/tex] guesses incorrectly. Then we claim that, for all experts, the above algorithm satisfies [tex]$m(t) \leq \frac{2 \log n}{\epsilon} +2(1+\epsilon) m_i(t)$[/tex]. We’ll prove this in a second, but note that since this holds for all experts (for all [tex]$i$[/tex]), it holds in particular for the best expert. In other words, the number of mistakes we make is (ignoring for the moment the first term) at most roughly twice that of the best expert! Pretty cool, no?
How do you prove this? Well, it’s fairly straightforward. Note that the weight of the [tex]$i$[/tex]th expert after playing [tex]$t$[/tex] times is [tex]$w_i(t)=(1-\epsilon)^{m_i(t)}$[/tex]. Define the “potential” function [tex]$\phi(t)=\sum_i w_i(t)$[/tex] (so that [tex]$\phi(1)=n$[/tex]). Now what happens to the potential function as a function of time? Well, the potential will always decrease as long as there is at least one expert who is wrong. What we are interested in is how it decreases when we make a mistake in our guess, i.e. when by following the weighted majority we are wrong. In this case, at least half of the total weight gets multiplied by [tex]$1-\epsilon$[/tex]. Thus in this case the potential gets multiplied by a factor of at most [tex]$\frac{1}{2}+\frac{1}{2}(1-\epsilon)=1-\frac{\epsilon}{2}$[/tex]. Thus if we have made [tex]$m(t)$[/tex] mistakes at time [tex]$t$[/tex], the potential at that time must satisfy [tex]$\phi(t) \leq n\left(1-\frac{\epsilon}{2} \right)^{m(t)}$[/tex]. Now since the weights remain positive, the potential function is always at least as great as any one of its constituent weights: [tex]$\phi(t) \geq w_i(t)$[/tex] for all [tex]$i$[/tex], so that [tex]$w_i(t) \leq n\left(1-\frac{\epsilon}{2} \right)^{m(t)}$[/tex]. Putting this together with [tex]$w_i(t)=(1-\epsilon)^{m_i(t)}$[/tex] we thus find that [tex]$(1-\epsilon)^{m_i(t)} \leq n\left(1-\frac{\epsilon}{2} \right)^{m(t)}$[/tex]. Taking the log of this expression we obtain [tex]$m_i(t) \log(1-\epsilon) \leq \log n + m(t) \log \left(1-\frac{\epsilon}{2} \right)$[/tex], and using [tex]$- \log(1-x) < x+x^2$[/tex] for [tex]$0<x<\frac{1}{2}$[/tex] plus throwing away some [tex]$\epsilon$[/tex] corrections, we obtain our desired expression [tex]$m(t) \leq \frac{2 \log n}{\epsilon} +2(1+\epsilon) m_i(t)$[/tex].
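The inequality used in that last step, [tex]$-\log(1-x) < x+x^2$[/tex] for [tex]$0<x<\frac{1}{2}$[/tex], is easy to sanity-check numerically; a quick sketch:

```python
import math

# spot-check -log(1-x) < x + x^2 on a fine grid over (0, 1/2);
# this is the elementary inequality the derivation above leans on
for k in range(1, 500):
    x = k / 1000.0
    assert -math.log(1.0 - x) < x + x * x
```

(The grid check is, of course, no substitute for the Taylor-series argument, but it is reassuring.)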
So what use is this multiplicative weight algorithm? Well that was the subject of Sanjeev’s talk, describing all sorts of cool applications of this algorithm, including boosting in machine learning theory, approximately solving semidefinite programs, approximately solving fractional packing and covering problems, and online convex optimization. These latter tasks were the subject of Sanjeev’s talk, and the interested reader is directed to this survey paper.
Pre-Dewiging Pictures
Last week, after going to Innsbruck, I attended the QIPC Workshop in London. The workshop was held at the Royal Society of London and the theme was “Physicists and Computer Scientists Unite!” Scott Aaronson and I were asked to open up the workshop, and, because we are two entirely shameless people, we decided that the best way to do this was, given the locale, to replace the debate between physicists and computer scientists with an opening debate between Newton and Leibniz. Scott has posted our opening dialogue here and the talk which followed it here. And, even more importantly, and to embarrass ourselves the most, I now present to you incriminating evidence (thanks Viv!) that Scott and I are indeed crazy enough to mock these two great scientists by wearing the appropriate attire:
If you look closely at the following picture, you’ll see that Scott is on the ground groveling before Newton:
And finally, here we make fun of a cookie name:
Laugh Therapy
Oh, my brain is sore. Why, oh why, do I get suckered into reading things like quant-ph/0610117? Now I could go on and on about this paper, but instead I thought I’d cut and paste my favorite parts. The parts that didn’t make me want to send my head straight through my monitor. Mother always said if you can’t say anything nice about a paper, cut and paste the better parts and make funny statements about them. Call it laugh therapy, if you will.
Okay, let’s begin the therapy. This part is funny. It gives what I call an “argument by ignorance”:
On the other hand, the heavy machinery of the theoretical quantum computation with its specific terminology, lemmas, etc, is not readily accessible to most physicists, including myself.
So you want to criticize fault-tolerant quantum computation, but you readily admit that you do not understand it? Ha! That just cracks me up. Is this sort of like the arguments that “math is useless” because “I haven’t used math in years?” Oh, and having read the literature, I’m pretty sure that even a physicist should be able to understand it. I mean, I’m a physicist and I understand it…and I’m not even a string theorist.
Another good part of the paper is reference [19]:
The future quantum engineer is a mythical personage, who will finally achieve factorization of numbers like [tex]$10^{260}$[/tex].
Now, if I hadn’t just read the statement of ignorance which opens up this paper, I might assume that this is an ironic statement. But just having stated that things like “lemmas” are too complicated, I just can’t make that assumption. And when you say that factoring “requires” exponential time, I can’t assume that you know pretty much anything about computer science or discrete math. So I’d like to say that even today there are engineers who can factorize these numbers. Indeed [tex]$10^{260}=2^{260} \times 5^{260}$[/tex].
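For the skeptical, Python’s exact big-integer arithmetic confirms the joke:

```python
# 10^260 factors trivially, since 10 = 2 * 5; check it exactly
n = 10 ** 260
assert n == 2 ** 260 * 5 ** 260
assert len(str(n)) == 261   # a 1 followed by 260 zeros
```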
Here is another good one:
Alicki [21] has made a mathematical analysis of the consequences of finite gate duration. I am not in a position to check his math, but I like his result: the fidelity exponentially decreases in time.
Sweet. That’s an argument by ignorance followed by an argument by “I like it!”
A Fox News bifecta!
Argh. #@%@!! Yep, that’s about all I can say.
Quantized Celebrity
I’m sure all of you are well aware of the fact that Carmen Electra recently split up with Dave Navarro. Oh, what? You’re not? And why am I bringing this up? Check out this article. You’ve got to wade down to the last paragraph. Okay, I’ll wade for you:
In other Electra news, in the notable quotables from Star Pulse, Carmen drops this wonder:
“I’m really into quantum physics. Some of my friends are into it, some of them aren’t, so I’m trying to get them excited about discovering all these interesting things about thoughts and the power of thoughts.”
“It gives me chills thinking about it. It’s fun.”
Well, okay, I’m pretty sure that “the power of thoughts” and “quantum physics” is Deepak Chopra nonsense, but on the other hand, I’m happy that I share in common with Carmen Electra the belief that thinking about quantum physics is fun.
Philosophia Naturalis 2
The second Philosophia Naturalis has now been posted.
We can dance if we want to…
…We can leave your friends behind. ‘Cause your friends don’t dance and if they don’t dance, well they’re no friends of mine.
Yep, that’s right, it’s the new paper dance! In last week’s arXiv listing, a new paper by myself and Andrea Cassacino, a graduate student from Siena, Italy, appeared as quant-ph/0610088:
Quantum Error Correcting Subsystem Codes From Two Classical Linear Codes
Authors: Dave Bacon, Andrea Cassacino
Comments: 8 pages, Allerton 2006 conference
The essential insight of quantum error correction was that quantum information can be protected by suitably encoding this quantum information across multiple independently erring quantum systems. Recently it was realized that, since the most general method for encoding quantum information is to encode it into a subsystem, there exists a novel form of quantum error correction beyond the traditional quantum error correcting subspace codes. These new quantum error correcting subsystem codes differ from subspace codes in that their error recovery routines can be considerably simpler than those of related subspace codes. Here we present a class of quantum error correcting subsystem codes constructed from two classical linear codes. These codes are the subsystem versions of the quantum error correcting subspace codes which are generalizations of Shor’s original quantum error correcting subspace codes. For every Shor-type code, the codes we present give a considerable savings in the number of stabilizer measurements needed in their error recovery routines.
The cool thing about this paper is that it officially makes me old. Why? Because now when you click on my name on the arxiv, I now have a second page of listings. Old as the hills, I say!
Ah, the Life of a Theorist
Nothing quite like staring out at the beautiful mountains of Innsbruck while Scott Optimizes and I Pontificate.
Dave: “Gee I wonder what the meaning of life is?”
Scott: “That’s not a very precise question. Could you please state some conjecture about the meaning of life that I can prove?”
Dave: “Let x be a positive integer….”
[Thanks to Hans Briegel for these amusing pictures.]
Quanteninformation
Now this
is what the view from every office should look like. I’m currently in Innsbruck, Austria, visiting the Institut für Quantenoptik und Quanteninformation. One of the few places in the world where you can study quantum computing and then go skiing. Or, if you want to, you can go skiing and then study quantum computing. Of course my favorite is to ski and do quantum computing at the same time. This is fine just as long as you realize that putting yourself in a superposition of going to the left of a tree and to the right of a tree will most probably result in you dying.
Scott is here as well and tomorrow the poor denizens will get a double feature of our talks.
Interference of the Nonfootball Kind
Interference can occur in both classical and quantum systems. In the classical world we are well acclimatized to thinking of water waves or electromagnetic waves or slinky waves interfering. In the quantum world interference plays a particularly central role, as we learn that we can at best calculate the amplitudes of events, and constructive and destructive interference is essential in understanding the speedups of quantum algorithms. The difference between the classical world and the quantum world cannot just be summed up as “interference.” However, it is also clear that interference is an essential difference between quantum computers and probabilistic classical computers. So understanding interference seems to be an important task for thinking about quantum algorithms.
But when I think about interference in a classical world, my picture is almost always one of classical wave equations. Is there a simple computer sciencish model which shows interference but isn’t based on wave equations or some other “physical” wave? Here is one such example.
Consider the quintessential quantum interference experiment. Start a single qubit in the state [tex]$|0\rangle$[/tex]. Now if we apply the Hadamard transform [tex]$\frac{1}{\sqrt{2}}\left[ \begin{array}{cc} 1 & 1 \\ 1 & -1 \end{array}\right]$[/tex] to this qubit, we now have the state [tex]$\frac{1}{\sqrt{2}}(|0\rangle+|1\rangle)$[/tex]. Measurement of this state would then result in 50 percent probability of [tex]$|0\rangle$[/tex] and 50 percent probability of [tex]$|1\rangle$[/tex]. But suppose that, instead of performing this measurement, we apply another Hadamard. This will result in the state [tex]$|0\rangle$[/tex], and measurement will now result in this state with certainty. So this is kind of strange. Suppose that you wanted to replace the above quantum evolution by a single classical bit which evolves stochastically. So you are looking for an operation which, when applied to a bit in the state 0, turns this state into a 50/50 mixture of the states 0 and 1, but which, applied twice, results back in the state 0. You can quickly convince yourself that there is no 2 by 2 stochastic matrix with this property. We cannot replace a single qubit by a single bit if we are to assume that each quantum gate is mapped to a stochastic map on a bit. Of course, what we are seeing here is interference: in the quantum case, the two paths that the qubit takes, one through [tex]$|0\rangle$[/tex] and the other through [tex]$|1\rangle$[/tex], interfere in such a way that we end up back in the [tex]$|0\rangle$[/tex] state. Since probabilities are always positive, we cannot achieve a similar evolution using stochastic evolutions on a single bit.
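Both halves of this claim are easy to check numerically; here is a sketch with numpy (the sweep over candidate stochastic matrices is my own illustration of why no single-bit stochastic map works):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])

# one Hadamard: equal amplitudes, so measuring gives 0 or 1 with prob 1/2 each
probs_once = np.abs(H @ ket0) ** 2

# two Hadamards: the amplitudes interfere and we are back in |0> with certainty
probs_twice = np.abs(H @ H @ ket0) ** 2

# now try the same with a 2x2 stochastic matrix acting on one classical bit.
# Sending 0 to a 50/50 mixture forces the first column to be (1/2, 1/2);
# sweep the remaining free column (b, 1-b) and see how close two
# applications can get to returning the bit to 0 with certainty.
best = 0.0
for b in np.linspace(0.0, 1.0, 1001):
    S = np.array([[0.5, b], [0.5, 1.0 - b]])
    best = max(best, (S @ S @ ket0)[0])
# (S @ S @ ket0)[0] works out to 0.25 + 0.5*b, which never exceeds 0.75
```

So while the quantum circuit returns to [tex]$|0\rangle$[/tex] with probability 1, the best any such stochastic matrix can do after two steps is 3/4.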
But wait! What if we wanted to model this qubit experiment by a classical experiment in which we replace the qubit by two classical bits? Now what can we do? Here is an idea. We need some randomness, so let’s make the first of the two classical bits a 50/50 mixture of the two states 0 and 1. Now suppose that we use the second bit to represent the state we are measuring. I.e. in the above experiment, when we start in the [tex]$|0\rangle$[/tex] quantum state, this means starting the second classical bit in the state 0. Now replace the Hadamard operation in the classical case with a classical operation which swaps the two bits. Certainly if we apply this classical gate once and measure the second bit, we end up in a 50/50 mixture, and if we apply the swap twice, since swapping twice is the identity, measuring the second bit will now give 0 again! So we’ve mimicked the above quantum interference experiment using two classical bits.
Well, coming from the quantum world, you might wonder about the following experiment. Suppose that you again start your quantum world in the state [tex]$|0\rangle$[/tex], apply a Hadamard, and then apply a Pauli [tex]$Z=\left[\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array} \right]$[/tex] gate. Now, of course, if you make a measurement you again end up in a 50/50 mixture of the states [tex]$|0\rangle$[/tex] and [tex]$|1\rangle$[/tex]. However, if you forgo this measurement and instead apply another Hadamard, you will end up in the state [tex]$|1\rangle$[/tex]. So by applying a gate [tex]$Z$[/tex] which does nothing to the probabilities had you measured the system, you’ve changed the state to [tex]$|1\rangle$[/tex] after the Hadamard. Very strange, but we know that this is nothing more than the effect of changing the phase of one of the states such that what was previously constructive interference becomes destructive interference (and vice versa). The obvious question is whether we can now mimic this using our two classical bits. Of course! We do this by applying a bit flip to the first bit whenever we apply the [tex]$Z$[/tex] quantum gate. Thus in the above experiment we replace Hadamard, [tex]$Z$[/tex], Hadamard by swap, bit flip on the first bit, swap. This will, if we start the second bit in 0, produce the final result of 1, as desired, and if we were to measure the system after the first swap, or after the first swap and the bit flip on the first bit, we get a 50/50 mixture, as desired.
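The two-bit model described above is simple to simulate; here is a sketch (the function name and gate labels are my own, following the swap/flip rules from the text):

```python
import random

def run_two_bit_model(gates, trials=10000):
    """Simulate the two-bit toy model.

    Bit b1 starts as a 50/50 random bit, bit b2 (the one we 'measure')
    starts in 0.  'H' is modeled as a swap of the two bits and 'Z' as a
    flip of the first bit.  Returns the fraction of trials in which the
    measured second bit ends up 1.
    """
    ones = 0
    for _ in range(trials):
        b1, b2 = random.randint(0, 1), 0
        for g in gates:
            if g == 'H':
                b1, b2 = b2, b1      # Hadamard -> swap the bits
            elif g == 'Z':
                b1 ^= 1              # Z -> flip the hidden first bit
        ones += b2
    return ones / trials
```

A single swap gives a 50/50 mixture, two swaps return 0 every time, and swap–flip–swap gives 1 every time, matching the [tex]$H$[/tex], [tex]$HH$[/tex], and [tex]$HZH$[/tex] experiments above.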
Can this be extended even further? Yep. First notice that the group of reversible operations on 2 classical bits is nothing more than the permutation group of four objects (the states 00, 01, 10, and 11). There are 24 elements in this group. Now the Pauli group on a single qubit, if you forget about a global phase, which for a single qubit can have no observable effect, has only four different elements (note that modding out this phase does not produce a group, since [tex]$XZ=-ZX$[/tex]; what I’m describing is a set of elements which act in an observably distinct fashion). The Clifford group elements which are not themselves Pauli group elements act as the automorphism group on the Pauli group. Indeed they act, modulo again a global phase, to permute the three nontrivial elements of the Pauli group. Thus this is, modulo again a global phase, nothing more than 3!=6 permutations. The total size of the Clifford group, when we mod out the effect of a global phase, is 4 times 6, or 24. But that is the size of the permutation group of four objects. Indeed you can work out that, because of this, you can mimic exactly the Clifford group gates using the two-bit construction detailed above. Kind of cool, no?
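This counting is easy to verify numerically; a sketch using numpy, where the Hadamard [tex]$H$[/tex] and the phase gate [tex]$S={\rm diag}(1,i)$[/tex] (my choice of generators, not anything named above) conjugate the nontrivial Paulis into one another up to phase:

```python
import numpy as np
from itertools import permutations

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Y = 1j * X @ Z
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)   # the phase gate

paulis = {'X': X, 'Y': Y, 'Z': Z}

def conj_image(U, P):
    """Which nontrivial Pauli (up to phase) does U P U^dagger equal?"""
    Q = U @ P @ U.conj().T
    for name, R in paulis.items():
        if any(np.allclose(Q, ph * R) for ph in (1, -1, 1j, -1j)):
            return name
    return None

# H and S each permute {X, Y, Z} up to phase: H swaps X and Z,
# S swaps X and Y, and together these generate all 3! = 6 permutations
perm_H = {p: conj_image(H, M) for p, M in paulis.items()}
perm_S = {p: conj_image(S, M) for p, M in paulis.items()}

# the reversible operations on two classical bits form the permutation
# group on four states, with 4! = 24 elements, matching 4 times 6
n_two_bit_ops = len(list(permutations(range(4))))
```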
So what we’ve got here is a classical model of the Clifford group gates, a set of gates which exhibit interference, but where we have replaced our qubit by two bits. This replacement doesn’t look much like a classical wave, or at least not one I’m used to, which is what I was looking for. So interference can be understood classically here: it is nothing more than the fact that we do not measure the full state of the classical system, which can lead to effects which look like constructive and destructive interference. Since we don’t have complete information about the system by just knowing the state of the second bit, we can observe a classical analogue of quantum interference.
[Those of you who are experts may immediately notice Spekkens’ toy model and the Gottesman-Knill theorem. But if you’re an expert, why are you reading my silly meandering writings? Get back to work…Carlos!]