OMG My Classical Probability Distribution Collapsed!

Scott Aaronson has a nice diatribe “Are Quantum States Exponentially Long Vectors?” which he’s posted on the arXiv as quant-ph/0507242. In this note he discusses his own personal counter to certain objections to quantum computation. It’s a very nice read.
My favorite part of the article is where Scott comes out as a full-on epistemologist:

To describe a state of n particles, we need to write down an exponentially long vector of exponentially small numbers, which themselves vary continuously. Moreover, the instant we measure a particle, we “collapse” the vector that describes its state—and not only that, but possibly the state of another particle on the opposite side of the universe. Quick, what theory have I just described?
The answer is classical probability theory. The moral is that, before we throw up our hands over the “extravagance” of the quantum worldview, we ought to ask: is it so much more extravagant than the classical probabilistic worldview?

To which I can only say “Amen, brother!” I think physicists, in particular, are VERY bad at understanding this argument.
Suppose we want to write down a theory which describes the state of classical bits. One can certainly pretend that the classical bits are always in some definite state, but now ask: how do we describe the state of our classical bits when we carry out an operation like flipping a fair coin and, conditional on the outcome, setting a bit to zero or one? We then need probabilities to describe our classical set of bits. If we have n classical bits, then the probability vector describing such a classical system is made up of two to the power n numbers (the probabilities). The number of numbers needed to describe a classical n-bit system is exponential in the number of bits! So should we be surprised that quantum computing requires states described by an exponential number of complex amplitudes? Doesn't seem as surprising now, does it?
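As a minimal sketch of that exponential bookkeeping (in Python, with a made-up helper name), here is the classical description after n independent fair coin flips: every one of the 2^n bit strings needs its own probability entry.

```python
import itertools

def coin_flip_state(n):
    """Probability vector for n classical bits, each set by an
    independent fair coin flip: one entry per bit string."""
    strings = [''.join(bits) for bits in itertools.product('01', repeat=n)]
    return {s: 1.0 / 2**n for s in strings}

dist = coin_flip_state(3)
print(len(dist))  # prints 8: already 2^3 numbers for just 3 bits
```

For 300 bits the dictionary would need more entries than there are atoms in the visible universe, which is exactly the "extravagance" at issue.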
And there are a bunch of other similarities between probabilistic computation and quantum computation. If we measure such a classical system, we certainly get one of the bit strings, and our description immediately changes to a probability distribution with only one nonzero entry: the probability distribution collapses. Similarly, if we perform a single measurement, we don't learn the probabilities themselves, i.e. we don't learn the (real) numbers describing the classical state.
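This collapse-without-learning can be sketched in a few lines of Python (the `measure` helper here is my own invention, not anything standard):

```python
import random

def measure(dist):
    """Sample an outcome from the distribution, then 'collapse' our
    description to a distribution with a single nonzero entry."""
    strings = list(dist)
    outcome = random.choices(strings, weights=[dist[s] for s in strings])[0]
    return outcome, {s: (1.0 if s == outcome else 0.0) for s in strings}

uniform = {'00': 0.25, '01': 0.25, '10': 0.25, '11': 0.25}
outcome, collapsed = measure(uniform)
# collapsed now has exactly one nonzero entry, and the single sample
# tells us nothing about the original probabilities themselves.
```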
Another interesting analogy (which can only be pushed so far…and this is the really interesting part!) is with correlated bits. Suppose I flip a fair coin: if the outcome is heads I put two bits which are both zero into two boxes, and if the outcome is tails I put two bits which are both one into two boxes. What is our description of the classical probabilistic state of these two boxes? We say 50% 00 and 50% 11. Now carry these boxes to the far ends of the universe and open one of them. Opening this box, I immediately know that whatever is in this box, the other bit, on the other side of the universe, must have the same value as my bit. Communication faster than light? No! Correlated bits? Yes! As a global observer, we can update our description of the system after a measurement by appropriately collapsing the probability distribution. Notice that until information about the measurement is communicated from one party to the other, the left-out party can't change his/her description of their system (or of the global system). Quantum entanglement is a "bit" like this…but the surprising thing is that it turns out to be different! How different? Well, this is the subject of Bell's theorem and, needless to say, the end result is one of the essential differences between classical probabilistic computation and quantum computation. But the fact that quantum theory is a consistent way to describe probability amplitudes is directly analogous to the manner in which classical probabilistic descriptions work!
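A quick sketch of the two boxes, again in Python with a made-up `open_box` helper: updating the global description is nothing but conditioning the joint distribution on what one party sees.

```python
def open_box(joint, box, value):
    """Condition the joint distribution on finding `value` in one box."""
    kept = {s: p for s, p in joint.items() if s[box] == str(value)}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}

joint = {'00': 0.5, '11': 0.5}        # a fair coin fills both boxes alike
updated = open_box(joint, box=0, value=0)
print(updated)  # prints {'00': 1.0}: the far box must also hold 0
```

No signal travels anywhere: only the describer's knowledge changes.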
There are even more similarities between quantum computation and probabilistic classical computation. For example, there is a classical analogue of teleportation. It works out to be one-time pads!
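Here is a minimal sketch of that classical analogue, assuming the standard one-time-pad setup (a shared random key bit playing the role of the entangled pair, one classical bit sent over the channel):

```python
import random

key = random.randint(0, 1)  # shared correlated randomness held by both parties

def encode(message_bit, key_bit):
    """Sender XORs the message with the shared key (the one-time pad)."""
    return message_bit ^ key_bit

def decode(channel_bit, key_bit):
    """Receiver XORs again to recover the message."""
    return channel_bit ^ key_bit

for m in (0, 1):
    assert decode(encode(m, key), key) == m
# The transmitted bit alone is uniformly random: without the key it
# carries no information about m, much as the two classical bits in
# teleportation alone reveal nothing about the teleported state.
```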
Notice that to get these interpretations of the similarities between classical probabilistic computation and quantum computation, we need to adopt a particular stance towards quantum theory. This is the epistemological view of quantum theory. In this view, roughly, the wave function of a quantum system is merely a description of a quantum system. It is not, say, like the classical position of a particle, which is a real number we can actually assign as a property of that classical system. I must say that I find myself very much in tune with this view of quantum theory. This does not mean, however, that this point of view totally solves all the problems people have with quantum theory. In particular, the problems of contextuality and the nonexistence of local hidden variable theories remain "troubling," and the question of "a description of WHAT?" is roughly the measurement problem. I certainly think that among quantum computing theorists, roughly this point of view is gaining more and more adherents. Which is good, because any mention of many worlds is destined to make me go crazy!
As a side note, when I started teaching the quantum computing course this summer, I attempted to teach quantum theory from the epistemological point of view. Unfortunately, the pace I set was too fast, and so I had to change tactics. But it certainly would be interesting to try to teach quantum theory from this perspective.
A final quote from Scott:

For almost a century, quantum mechanics was like a Kabbalistic secret that God revealed to Bohr, Bohr revealed to the physicists, and the physicists revealed (clearly) to no one. So long as the lasers and transistors worked, the rest of us shrugged at all the talk of complementarity and wave-particle duality, taking for granted that we’d never understand, or need to understand, what such things actually meant. But today—largely because of quantum computing—the Schrödinger’s cat is out of the bag, and all of us are being forced to confront the exponential Beast that lurks inside our current picture of the world. And as you’d expect, not everyone is happy about that, just as the physicists themselves weren’t all happy when they first had to confront it in the 1920s.

Which I really like, but I must take issue with. It’s all the physicists’ fault for not clearly communicating?! I don’t think so. I think computer scientists were too busy with other important things, like, say, inventing the modern computer and building modern complexity theory, to even bother coming over and talking with us physicists about quantum theory. Just because you weren’t paying attention doesn’t mean you get to say that physicists weren’t communicating clearly! Notice that it was basically three physicists, Benioff, Feynman, and Deutsch, who first really raised the question of what exactly a quantum computer would be. Of course it took computer scientists, like Bernstein, Vazirani, Simon, and Shor, to actually show us the path forward! But I think someone just as easily could have thought up quantum computation in 1950 as in 1980. The reason why it took so long to dream up quantum computers probably has more to do with the fact that no one, physicists or computer scientists, could really imagine doing the kinds of experiments which quantum computers represent. Of course, none of this really matters, but it’s fun to yell and scream about this and pretend that it makes some sort of difference, when really it’s just fun and funny.

Make It Planar

Steve Flammia points me to this cool game. Well at least it’s cool if you are the computer science type.

Beyond Moore's Law

If Moore’s law continues at its current pace, sometime between 2040 and 2050 the basic elements of a computer will be atomic sized. And even if Moore’s law slows, we still hope that computers will eventually be made of components which are atomic sized. Either way, we really do believe it might be possible to get to “the end of Moore’s curve.” A question I like to ask (especially of those employed in the computer industry) is “What will happen to your job when we hit the atomic size barrier for computer components?” (or, more interestingly, “will you still have a job when Moore’s law ends?” Yes, I know, software is important, architecture is important, etc. I still think the end of Moore’s law will result in major changes in the computer industry.)
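The 2040-2050 figure is easy to check on the back of an envelope. A hedged sketch, under my own assumptions: roughly a 90 nm process node in 2005, an atomic diameter of about 0.2 nm, and transistor counts doubling every two years (so linear feature size shrinks by a factor of the square root of two each period).

```python
import math

feature_nm = 90.0        # roughly the leading process node in 2005 (assumption)
atom_nm = 0.2            # rough atomic diameter (assumption)
years_per_doubling = 2   # transistor count doubles every ~2 years, so the
                         # linear feature size shrinks by sqrt(2) each period

shrinks = math.log(feature_nm / atom_nm) / math.log(math.sqrt(2))
print(2005 + round(shrinks * years_per_doubling))  # prints 2040
```

Different (equally defensible) choices for the starting node or the atomic cutoff slide the answer around within that same decade.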
One point which comes up when we think about the end of Moore’s law is that, in some sense, the end we’re talking about is the end of a particular manner of creating fast computing devices. Mostly we are thinking about the end of silicon-based integrated circuits, and more broadly about the end of transistor-based computers (i.e. we include both silicon-based circuits and molecular transistors, etc.). And the fundamental speed of these devices is rooted in physics. So what can physics tell us about what lies beyond Moore’s law?
Well, first of all, the length scales involved in the traditional version of Moore’s law are atomic length scales. The barrier we are hitting is basically the barrier set by the laws of atomic physics. But we know, of course, that there are smaller length scales possible. In particular, the next step down the ladder of sizes is to go to nuclear length scales. But we also need to say something about the speeds of operations. What are the limits to the speeds at which the gates of an atomic-physics-based computer can operate? Interestingly, there is often a lot of confusion about this question. For example, suppose you are trying to drive an atomic transition (this is nothing like our transistors, but bear with me) with your good old tabletop laser. The speed at which you can drive this transition is related to the intensity of the laser beam. So it might seem, at first guess, that you can just keep cranking up the intensity of the laser beam to get faster and faster transitions. But eventually this will fail. Why? Because as you turn up the intensity of the laser beam you also increase the probability that your system will make a transition to a state you don’t want it to be in. This may be another energy level, or it may be that you blow the atom apart, or you blow the atom out of whatever is keeping it in place, etc. Now, generally the intensity of the laser beam at which this becomes important is related to the energy spacing in the atomic system (if, say, you are trying not to excite to a different level). Note that the actual energy spacing of the levels you are driving is NOT the relevant information, but most of the time this spacing is the same order of magnitude as the spacings to the levels you are trying to avoid. So this allows us to argue, roughly, that the maximum speed we will achieve for our transition is the energy spacing divided by Planck’s constant.
Now for atomic systems, the energy levels we are talking about might be, say a few electron Volts. So we might expect that our upper limit of speeds from our gate is something like 10^(15) Hz. Note that today’s computers, which don’t operate by driving atomic transitions, but in a different manner, operate with clock speeds of 10^(9) Hz (yeah, yeah, clock speed is no guarantee of instructions per second, but I’m a physicist, so order of magnitude is my middle name.) Only 6 more orders of magnitude to go!
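The order-of-magnitude estimate itself is one line of arithmetic (the "a few eV" choice below is my own representative value):

```python
h = 6.626e-34   # Planck's constant, J*s
eV = 1.602e-19  # one electron volt in joules

atomic_gap = 2 * eV     # a typical atomic transition energy: a few eV
print(atomic_gap / h)   # ~5e14 Hz, i.e. order 10^15 Hz
```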
So what does happen if we hit the end of atomic-sized computing devices? As I mentioned, the next step up the length-scale-slash-energy-scale ladder is nuclear systems. Here we find energy scales which are typically millions of electron volts. But I have absolutely no idea how to build a computer where internal nuclear states are used to compute. Which doesn’t mean that it’s impossible, of course (which reminds me of a great quote by the late great John Bell: “what is proved by impossibility proofs is lack of imagination.”) So there’s a good problem for a nuclear physicist with a few spare moments: think up a method for computing using nuclear transitions.
One can continue up the energy scale, of course. But now it gets even more far out to imagine how to get the device to compute. Is it possible to turn the Large Hadron Collider, currently being built at CERN, into a computer operating at 10^(27) Hz (energies of tera-electron volts)? Now THAT would be a fast computer!

A Fork In the Road for Ion Traps

Big news for ion trap quantum computers. It seems that Christopher Monroe’s ion trap group at the University of Michigan has succeeded in getting ions to shuttle around the corner of a T in their ion traps (their news item for this result is dated 6/11/05). This is, needless to say, a crucial step in building a “reasonable” architecture for quantum computing. This kind of thing makes me want to give up my theory license and jump into the lab!

Paper and Book Roundup

Some interesting papers.
First, a paper by Andrew Childs and Wim van Dam, “Quantum algorithm for a generalized hidden shift problem”, quant-ph/0507190, which gives a very nice new algorithm for, well, what it says: hidden shift problems! Interestingly, their new algorithm uses Lenstra’s classical integer programming algorithm to implement an entangled measurement on the quantum states they set up. I just started reading the paper this morning. Once I parse it, I may have more to post.
Another interesting paper is “Rigorous location of phase transitions in hard optimization problems” which is, amazingly, a computer science article published in…Nature. If you read this paper and are a physicist, it will make you very proud:

Our results prove that the heuristic predictions of statistical physics in this context are essentially correct.

In other words…yeah the physicists are actually really good at guessing what approximations to make! The paper is nice as well, rigorously proving some nice properties of random instances of certain NP-complete problems.
Finally, I received in the mail yesterday “Probability Theory” by E.T. Jaynes. This book, in incomplete form, had been available on the web for many years. Following Jaynes’ death, G. Larry Bretthorst was able to collect some (but not all) of this material into “Probability Theory.” Unfortunately, Jaynes had intended to write two volumes, and it seems that the second volume was woefully incomplete and so will not be published.

Got Quantum Problems?

Scott Aaronson has written a nice article “Ten Semi-Grand Challenges for Quantum Computing Theory”. If you are a computer science theory researcher interested in what to work on in quantum computing, I highly recommend the list. One thing I find very interesting about theory work in computer science is how religious researchers are about not sharing the problems they are working on. So it is very nice of Scott to share what he thinks are big open problems in quantum computing theory today.
One thing that Scott leaves out of the list, which I would have included, is a question along the lines of “how can quantum computing theory contribute to classical computing theory?” Scott explicitly says he does not include this question because it is “completely inexhaustible,” and I agree that this is certainly true, but this may be exactly the reason one should work on it! The idea behind this line of research is to prove results in classical computational theory via insights gained from quantum computational theory. An analogy which may or may not be stretching things a bit is the relationship between real analysis and complex analysis. Anyone who has studied these two subjects knows that real analysis is much more difficult than complex analysis. Physicists know this best in that they often cannot easily do certain real integrals unless they pretend their real variables are complex and integrate along a particularly well-chosen contour. Similarly, many results in real analysis have counterparts in complex analysis which are easy to prove. So the line of research which asks whether quantum computation can contribute to classical computation is basically “as complex analysis is to real analysis, so quantum computing is to classical computing.” I’ve discussed this possibility (along with Scott’s contribution to it) here.

Best Title Ever? A New Nomination

On the quant-ph arXiv today, we find Steven van Enk having way too much fun:

Quantum Physics, abstract
quant-ph/0507189

From: Steven J. van Enk [view email]
Date: Tue, 19 Jul 2005 19:10:35 GMT   (2kb)

|0>|1>+|1>|0>

Authors:
S.J. van Enk
Comments: 1.1 page, unnormalized title

is entangled. (There is nothing more in the abstract, but quant-ph does not
accept 2-word abstracts, apparently.)

Full-text: PostScript, PDF, or Other formats

Erasing Landauer's Principle

Three Toed Sloth (who has been attending the complex systems summer school in China which I was supposed to attend before my life turned upside down and I ran off to Seattle) has an interesting post on Landauer’s principle. Landauer’s principle is roughly the principle that erasing information in thermodynamics dissipates an amount of entropy equal to Boltzmann’s constant times ln 2 times the number of bits erased. Cosma points to two papers, Orly Shenker’s “Logic and Entropy” and John Norton’s “Eaters of the Lotus”, which both claim problems with Landauer’s principle. On the bus home I had a chance to read both of these papers, and at least get an idea of what the arguments are. Actually, both articles point towards the same problem.
Here is a simplistic approach to Landauer’s principle. Suppose you have a bit which has two values of a macroscopic property, which we call 0 and 1. Also suppose that there are other degrees of freedom for this bit (like, say, the pressure of whatever is physically representing the bit). Now make up a phase space with one axis representing the 0 and 1 variables and another axis representing these other degrees of freedom. Actually, let’s fix this external degree of freedom to be the pressure, just to make notation easier. Imagine now the process which causes erasure. Such a process will take 0 to 0, say, and 1 to 0. Now look at this process in phase space. Remember that phase space volumes must be conserved. Examine now two phase space volumes. One corresponds to the bit being 0 and some range of the pressure. The other corresponds to the bit being 1 and this same range of pressures. In the erasure procedure we take 1 to 0, but now, because phase space volume must be preserved, we necessarily must change the values of the extra degree of freedom (the pressure): we can’t map the “1 plus range of pressures” region to the “0 plus the same range of pressures” region, because this latter bit of phase space is already used up. What this necessitates is an increase of entropy, which at its smallest can be k ln 2.
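The counting behind this argument fits in a toy model (my own construction, with the continuous pressure range coarse-grained into M discrete cells):

```python
import math

k = 1.380649e-23  # Boltzmann's constant, J/K

# Toy phase space: microstates are (bit, pressure cell) pairs,
# with M equally likely pressure cells per bit value.
M = 4
before = {(b, p) for b in (0, 1) for p in range(M)}  # 2*M occupied microstates

# A volume-preserving (one-to-one) erasure must send all of these to
# bit 0, so the pressure register needs 2*M cells afterwards:
after = {(0, p) for p in range(2 * M)}
assert len(after) == len(before)  # phase-space volume is conserved

# Entropy of the non-bit degrees of freedom grows from k ln M to k ln 2M:
delta_S = k * math.log(2 * M) - k * math.log(M)
print(delta_S / (k * math.log(2)))  # ratio of delta_S to k ln 2 (should be 1)
```

The bit's own entropy drops by k ln 2, but only by pushing exactly that much entropy into the pressure degree of freedom, which is the principle in miniature.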
From my quick reading of these articles, their issue is not so much with this argument per se, but with the interpretation of this argument (by which I mean they do not challenge the logical consistency of Landauer’s and others’ formulations of the principle, but challenge instead the interpretation of the problem these authors claim to be solving). In both articles we find the authors particularly concerned with how to treat the macroscopic variables corresponding to the bits 0 and 1. In particular, they argue that, contrary to what is implicit in the above type of argument, we should not treat these macroscopic variables as thermodynamic-physical magnitudes. The author of the first paper makes this explicitly clear by replacing the phase space picture I’ve presented above with two pictures, one in which the bit of information is 0 and one in which the bit of information is 1, and stating things like “A memory cell that – possibly unknown to us – started out in macrostate 0 will never be in macrostate 1” (emphasis the author’s). The author of the second article makes a similar point, in particular pointing out that “the collection of cells carrying random data is being treated illicitly as a canonical ensemble.”
What do I make of all this? Well, I’m certainly no expert. But it seems to me that these arguments center upon some very deep and interesting problems in the interpretation of thermodynamics, and also, I would like to argue, upon the fact that thermodynamics is not complete (this may even be as heretical as my statement that thermodynamics is not universal, perhaps even more so!) What do I mean by this? Consider, for example, one of our classical examples of memory, the ferromagnetic Ising model in two or more dimensions. In such a model we have a bunch of spins on a lattice, with interactions between nearest neighbors which have lower energy when the spins are aligned. In the classical thermodynamics of this system, above a certain critical temperature and in thermodynamic equilibrium, the total magnetization of the system is zero. Below this temperature, however, something funny happens. Two thermodynamic equilibrium states appear, one with the magnetization pointing mostly in one direction and one with the magnetization pointing mostly in the other direction. These are the two states in which we “store” information. But when you think about what is going on here, this bifurcation into two equilibria, you might wonder about the “completeness” of thermodynamics. Thermodynamics does not tell us which of these states is occupied, nor even that, say, each is occupied with equal probability. Thermodynamics does not give us the answer to a very interesting question: what is the probability distribution for the stored bit of information?
And it’s exactly this question around which the argument about Landauer’s principle revolves. Suppose you decide that for quantities such as the total magnetization you treat the two values as totally separate settings, with totally different phase spaces which cannot be accessed at the same time. Then you are led to the objections to Landauer’s principle sketched in the two papers. But now suppose that you take the point of view that thermodynamics should be completed in some way, such that these two macroscopic variables are taken into account as real thermodynamic physical variables. How to do this? The point which I think many physicists would make, it seems, is that no matter how you do this, once you’ve got them into the phase space, the argument presented above will produce a Landauer’s-principle-type argument. Certainly one way to do this is to assert that we don’t know which of the states the system is in (0 or 1), so we should assign each equal probability, but the point is that whatever probability assumption you make, you end up with a similar argument in terms of phase space volume. Notice also that to really make these volumes, the macroscopic variables should have some “spread”: i.e. what we call 0 and 1 are never precisely 0 and 1, but instead are some region around magnetization all pointing in one direction and some region around magnetization pointing in the other direction.
I really like the objections raised in these articles. But I’m not convinced that either side has won this battle. One interesting thing I note is that the argument against Landauer’s principle treats the two macrostates 0 and 1 in a very “error-free” manner. That is to say, they treat these variables as truly digital values. But (one last heresy!) I’m inclined to believe that nothing is perfectly digital. The digital nature of information in the physical world is an amazingly good approximation for computers…but it does fail. If you were able to precisely measure the information stored on your hard drive, you would not find zeros and ones, but instead zeros plus some small fluctuation and ones plus some small fluctuation. Plus, if there is ever an outside environment influencing the variable you are measuring, then it is certainly true that eventually your information, in thermodynamics, will disappear (see my previous article on Toom’s rule for hints as to why this should be so). So in that case, the claim that these two bit states should never be accessible to each other clearly breaks down. So I’m a bit worried (1) about the arguments against Landauer’s principle, from the point of view that digital information is only an approximation, but also (2) about the arguments for Landauer’s principle and the fact that they might somehow depend on how one completes thermodynamics to talk about multiple equilibria.
Of course, there is also the question of how all this works for quantum systems. But then we’d have to get into what quantum thermodynamics means, and well, that’s a battle for another day!
Update: be sure to read Cris Moore’s take on these two papers in the comments section. One thing I didn’t talk about was the example Shenker uses against Landauer’s principle. This was mostly because I didn’t understand it well enough, and, reading Cris’s comments, I agree with him that this counterexample seems to have problems.