Beyond Moore's Law

If Moore’s law continues at its current pace, sometime between 2040 and 2050 the basic elements of a computer will be atomic-sized. And even if Moore’s law slows, we still expect that computers will eventually be made of components which are atomic-sized. Either way, it seems plausible that we will get to “the end of Moore’s curve.” A question I like to ask (especially of those employed in the computer industry) is “What will happen to your job when we hit the atomic size barrier for computer components?” (or, more interestingly, “will you still have a job when Moore’s law ends?” Yes, I know, software is important, architecture is important, etc. I still think the end of Moore’s law will result in major changes in the computer industry.)
One question which comes up when we think about the end of Moore’s law is that, in some sense, the end we’re talking about is the end of a particular manner of creating fast computing devices. Mostly we are thinking about the end of silicon-based integrated circuits, and more broadly we are thinking about the end of transistor-based computers (i.e. we include both silicon-based circuits and also molecular transistors, etc.) And the fundamental speed of these devices is rooted in physics. So what can physics tell us about what lies beyond Moore’s law?
Well first of all, the length scales involved in the traditional version of Moore’s law are atomic length scales. The barrier we are hitting is basically the barrier set by the laws of atomic physics. But we know, of course, that there are smaller length scales possible. In particular the next step down the ladder of sizes is to go to nuclear length scales. But we also need to say something about the speeds of operations. What are the limits on the speeds at which the gates of an atomic-physics-based computer can operate? Interestingly, there is often a lot of confusion about this question. For example, suppose you are trying to drive an atomic transition (this is nothing like our transistors, but bear with me) with your good old tabletop laser. The speed at which you can drive this transition is related to the intensity of the laser beam. So it might seem, at first guess, that you can just keep cranking up the intensity of the laser beam to get faster and faster transitions. But eventually this will fail. Why? Because as you turn up the intensity of the laser beam you also increase the probability that your system will make a transition to a state you don’t want it to be in. This may be another energy level, or it may be that you blow the atom apart, or you blow the atom out of whatever is keeping it in place, etc. Now, generally the intensity of the laser beam at which this becomes important is related to the energy spacing in the atomic system (if, say, you are trying not to excite to a different level). Note that the actual energy spacing of the levels you are driving is NOT the relevant information, but most of the time this spacing is the same order of magnitude as the spacings to the levels you are trying to avoid. So this allows us to argue, roughly, that the maximum rate we will achieve for our transition is the energy spacing divided by Planck’s constant.
Now for atomic systems, the energy levels we are talking about might be, say, a few electron volts. So we might expect that the upper limit on the speed of our gate is something like 10^(15) Hz. Note that today’s computers, which don’t operate by driving atomic transitions, but in a different manner, operate with clock speeds of 10^(9) Hz (yeah, yeah, clock speed is no guarantee of instructions per second, but I’m a physicist, so order of magnitude is my middle name.) Only 6 more orders of magnitude to go!
So what does happen if we hit the end of atomic-sized computing devices? As I mentioned, the next step on the length scale slash energy scale is nuclear systems. Here we find energy scales which are typically millions of electron volts. But I have absolutely no idea how to build a computer where internal nuclear states are used to compute. Which doesn’t mean that it’s impossible, of course (which reminds me of a great quote by the late great John Bell: “what is proved by impossibility proofs is lack of imagination.”) So there’s a good problem for a nuclear physicist with a few spare moments: think up a method for computing using nuclear transitions.
One can continue up the energy scale, of course. But now it gets even more far out to imagine how to get the device to compute. Is it possible to turn the Large Hadron Collider, currently being built at CERN, into a computer operating at 10^(27) Hz (energies of tera-electron volts)? Now THAT would be a fast computer!
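As a sanity check on these order-of-magnitude claims, here is a quick back-of-envelope calculation (a sketch I’m adding for illustration; the “a few” energy values are just representative round numbers):

```python
H_EV_S = 4.135667e-15  # Planck's constant in eV * s

def max_frequency_hz(energy_ev):
    """Rough upper bound on a transition rate for a given energy gap:
    f ~ E / h (pure order-of-magnitude reasoning, as in the text)."""
    return energy_ev / H_EV_S

# Representative energy gaps for the three scales discussed above.
for label, energy_ev in [("atomic (a few eV)", 3.0),
                         ("nuclear (a few MeV)", 3.0e6),
                         ("collider (a few TeV)", 3.0e12)]:
    print(f"{label:22s} ~ {max_frequency_hz(energy_ev):.0e} Hz")
```

A few eV lands near 10^15 Hz, a few MeV near 10^21 Hz, and a few TeV near 10^27 Hz, matching the estimates in the text.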

A Fork In the Road for Ion Traps

Big news for ion trap quantum computers. It seems that Christopher Monroe’s ion trap group at the University of Michigan has succeeded in getting ions to shuttle around the corner of a T in their ion traps (their news item for this result is dated 6/11/05.) This is, needless to say, a crucial step in building a “reasonable” architecture for quantum computing. This kind of thing makes me want to give up my theory license and jump into the lab!

Paper and Book Roundup

Some interesting papers.
First, a paper by Andrew Childs and Wim van Dam, “Quantum algorithm for a generalized hidden shift problem”, quant-ph/0507190, which gives a very nice, new algorithm for, well, for what it says: hidden shift problems! Interestingly, their new algorithm uses Lenstra’s classical integer programming algorithm to implement an entangled measurement on the quantum states they set up. I just started reading the paper this morning. Once I parse it, I may have more to post.
Another interesting paper is “Rigorous location of phase transitions in hard optimization problems”, which is, amazingly, a computer science article published in…Nature. If you read this paper and are a physicist, it will make you very proud:

Our results prove that the heuristic predictions of statistical physics in this context are essentially correct.

In other words…yeah the physicists are actually really good at guessing what approximations to make! The paper is nice as well, rigorously proving some nice properties of random instances of certain NP-complete problems.
Finally, I received in the mail yesterday “Probability Theory” by E.T. Jaynes. This book, in incomplete form, had been available on the web for many years. Following Jaynes’ death, G. Larry Bretthorst was able to collect some (but not all) of this material into “Probability Theory.” Unfortunately, Jaynes had intended to have two volumes, and it seems that the second volume was woefully incomplete and so will not be published.

Got Quantum Problems?

Scott Aaronson has written a nice article “Ten Semi-Grand Challenges for Quantum Computing Theory”. If you are a computer science theory researcher interested in what to work on in quantum computing, I highly recommend the list. One thing I find very interesting about theory work in computer science is how religious researchers are about not sharing the problems they are working on. So it is very nice of Scott to share what he thinks are big open problems in quantum computing theory today.
One thing that Scott leaves out of the list, which I would have included, are questions along the lines of “how can quantum computing theory contribute to classical computing theory?” Scott explicitly says he does not include this question because it is “completely inexhaustible,” and I agree that this is certainly true, but this may be exactly the reason one should work on it! The idea behind this line of research is to prove results in classical computational theory by insights gained from quantum computational theory. An analogy which may or may not be stretching things a bit is the relationship between real analysis and complex analysis. Anyone who has studied these two subjects knows that real analysis is much more difficult than complex analysis. Physicists best know this in that they often cannot easily do certain real integrals unless they pretend their real variables are complex and integrate along a particularly well chosen contour. Similarly, many results in real analysis have counterparts in complex analysis which are easy to prove. So the line of research which asks whether quantum computation can contribute to classical computation is basically “as complex analysis is to real analysis, so quantum computing is to classical computing.” I’ve discussed this possibility (along with Scott’s contribution to it) here.
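The standard example of this trick (my illustration, not from the original post) is a real integral that becomes trivial once the real variable is promoted to a complex one and the contour is closed in the upper half plane:

```latex
\int_{-\infty}^{\infty} \frac{dx}{1+x^2}
  \;=\; \oint_C \frac{dz}{1+z^2}
  \;=\; 2\pi i \,\operatorname*{Res}_{z=i} \frac{1}{1+z^2}
  \;=\; 2\pi i \cdot \frac{1}{2i}
  \;=\; \pi,
```

where $C$ runs along the real axis and closes via a large semicircle in the upper half plane, whose contribution vanishes as the radius grows. The hope is that quantum arguments might make some classical computing questions similarly transparent.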

Erasing Landauer's Principle

Three Toed Sloth (who has been attending the complex systems summer school in China which I was supposed to attend before my life turned upside down and I ran off to Seattle) has an interesting post on Landauer’s principle. Landauer’s principle is roughly the principle that erasing information in thermodynamics dissipates an amount of entropy of at least Boltzmann’s constant times ln 2 per bit erased. Cosma points to two papers, Orly Shenker’s “Logic and Entropy”, and John Norton’s “Eaters of the Lotus”, which both claim problems with Landauer’s principle. On the bus home I had a chance to read both of these papers, and at least get an idea of what the arguments are. Actually both articles point towards the same problem.
Here is a simplistic approach to Landauer’s principle. Suppose you have a bit which has two values of a macroscopic property which we call 0 and 1. Also suppose that there are other degrees of freedom for this bit (like, say, the pressure of whatever is physically representing the bit). Now make up a phase space with one axis representing the 0 and 1 variables and another axis representing these degrees of freedom. Actually, let’s fix this external degree of freedom to be the pressure, just to make notation easier. Imagine now the process which causes erasure. Such a process will take 0 to 0, say, and 1 to 0. Now look at this process in phase space. Remember that phase space volumes must be conserved. Examine now two phase space volumes. One corresponds to the bit being 0 and some range of the pressure. The other corresponds to the bit being 1 and this same range of pressures. In the erasure procedure we take 1 to 0, but now, because phase space volume must be preserved, we necessarily must change the values of the extra degree of freedom (the pressure): we can’t map the 1-plus-range-of-pressures region to the 0-plus-the-same-range-of-pressures region, because this latter bit of phase space is already used up. What this necessitates is an increase of entropy, which at its smallest can be k ln 2.
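To make the bookkeeping concrete, here is a toy version of this counting argument (my own sketch; the coarse-graining into n_cells pressure cells is an arbitrary choice for illustration, not anything from the papers):

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

# Coarse-grain the "pressure" degree of freedom into equal cells.
n_cells = 8

# Before erasure: the occupied region covers (bit, cell) for both
# bit values -- 2 * n_cells cells of phase space in total.
volume_before = 2 * n_cells

# Erasure sends every state to bit 0. A volume-preserving (injective)
# map cannot squeeze 2 * n_cells cells into n_cells, so the pressure
# range must double: the environment's accessible volume grows.
volume_after_bit = n_cells       # bit register: only bit 0 remains
volume_after_env = 2 * n_cells   # pressure range has doubled

# Entropy increase of the non-bit degrees of freedom:
delta_S = K_B * math.log(volume_after_env / n_cells)
print(delta_S)  # equals k ln 2, independent of the choice of n_cells
```

The answer is independent of how finely you slice the pressure axis; all that matters is the factor-of-two growth forced by mapping 1 onto 0.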
From my quick reading of these articles, their issue is not so much with this argument per se, but with the interpretation of this argument (by which I mean they do not challenge the logical consistency of Landauer’s and others’ formulations of the principle, but challenge instead the interpretation of the problem these authors claim to be solving). In both articles we find the authors particularly concerned with how to treat the macroscopic variables corresponding to the bits 0 and 1. In particular they argue that implicit in the above type of argument is that we should not treat these macroscopic variables as thermodynamic-physical magnitudes. The author of the first paper makes this explicitly clear by replacing the phase space picture I’ve presented above with two pictures, one in which the bit of information is 0 and one in which the bit of information is 1, and stating things like “A memory cell that – possibly unknown to us – started out in macrostate 0 will never be in macrostate 1” (emphasis the author’s). The authors of the second article make a similar point, in particular pointing out that “the collection of cells carrying random data is being treated illicitly as a canonical ensemble.”
What do I make of all this? Well I’m certainly no expert. But it seems to me that these arguments center upon some very deep and interesting problems in the interpretation of thermodynamics, and also, I would like to argue, upon the fact that thermodynamics is not complete (this may even be as heretical as my statement that thermodynamics is not universal, perhaps it is even more heretical!) What do I mean by this? Consider, for example, one of our classical examples of memory, the two or greater dimensional ferromagnetic Ising model. In such a model we have a bunch of spins on a lattice with interactions between nearest neighbors which have lower energy when the spins are aligned. In the classical thermodynamics of this system, above a certain critical temperature, in thermodynamic equilibrium, the total magnetization of this system is zero. Below this temperature, however, something funny happens. Two thermodynamic equilibrium states appear, one with the magnetization pointing mostly in one direction and one with the magnetization pointing mostly in the other direction. These are the two states into which we “store” information. But, when you think about what is going on here, this bifurcation into two equilibria, you might wonder about the “completeness” of thermodynamics. Thermodynamics does not tell us which of these states is occupied, nor even that, say, each is occupied with equal probability. Thermodynamics does not give us the answer to a very interesting question: what is the probability distribution for the bit of stored information?
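A quick numerical illustration of this incompleteness (a minimal Metropolis sketch I’m adding, with arbitrary lattice size and temperature): below the critical temperature the simulation happily sits in whichever magnetization state you start it in, and nothing in the thermodynamics picks one.

```python
import math
import random

def ising_magnetization(L=16, T=1.5, sweeps=300, start=+1, seed=0):
    """Metropolis simulation of the 2D ferromagnetic Ising model
    (J = 1, periodic boundaries). Returns the final magnetization per
    spin. Below T_c ~ 2.27 the system stays in the phase it started
    in on these timescales."""
    rng = random.Random(seed)
    spins = [[start] * L for _ in range(L)]
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            nb = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            dE = 2 * spins[i][j] * nb  # energy cost of flipping (i, j)
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spins[i][j] = -spins[i][j]
    return sum(sum(row) for row in spins) / (L * L)

# Two equilibria: which one you get depends on history, not on the
# temperature -- thermodynamics alone does not choose between them.
print(ising_magnetization(start=+1))  # stays near +1
print(ising_magnetization(start=-1))  # stays near -1
```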
And it is exactly on this question that the argument about Landauer’s principle turns. Suppose you decide that for quantities such as the total magnetic field, you treat these as two totally separate settings with totally different phase spaces which cannot be accessed at the same time. Then you are led to the objections to Landauer’s principle sketched in the two papers. But now suppose that you take the point of view that thermodynamics should be completed in some way such that it takes into account these two macroscopic variables as real thermodynamic physical variables. How to do this? The point, I think many physicists would make, is that no matter how you do this, once you’ve got them into the phase space, the argument presented above will produce a Landauer’s-principle-type argument. Certainly one way to do this is to assert that we don’t know which of the states the system is in (0 or 1), so we should assign these each equal probability, but the point is that whatever probability assumption you make, you end up with a similar argument in terms of phase space volume. Notice also that really, to make these volumes, the macroscopic variables should have some “spread”: i.e. what we call 0 and 1 are never precisely 0 and 1, but instead are some region around magnetization all pointing in one direction and some region around magnetization pointing in the other direction.
I really like the objections raised in these articles. But I’m not convinced that either side has won this battle. One interesting thing which I note is that the argument against Landauer’s principle treats the two macrostates 0 and 1 in a very “error-free” manner. That is to say, they treat these variables as truly digital values. But (one last heresy!) I’m inclined to believe that nothing is perfectly digital. The digital nature of information in the physical world is an amazingly good approximation for computers…but it does fail. If you were able to precisely measure the information stored on your hard drive, you would not find zeros and ones, but instead zeros plus some small fluctuations and ones plus some small fluctuations. Plus, if there is ever an outside environment which is influencing the variable you are measuring, then it is certainly true that eventually your information, in thermodynamics, will disappear (see my previous article on Toom’s rule for hints as to why this should be so.) So in that case, the claim that these two bit states should never be accessible to each other clearly breaks down. So I’m a bit worried (1) about the arguments against Landauer’s principle from the point of view that digital information is only an approximation, but also (2) about arguments for Landauer’s principle and the fact that they might somehow depend on how one completes thermodynamics to talk about multiple equilibria.
Of course, there is also the question of how all this works for quantum systems. But then we’d have to get into what quantum thermodynamics means, and well, that’s a battle for another day!
Update: be sure to read Cris Moore’s take on these two papers in the comment section. One thing I didn’t talk about was the example Shenker used against Landauer’s principle. This was mostly because I didn’t understand it well enough, and reading Cris’s comments, I agree with him that this counterexample seems to have problems.

Toom's Rule, Thermodynamics, and Equilibrium of Histories

My recent post on thermodynamics and computation reminded me of a very nice article by Geoffrey Grinstein that I read a while back.
Suppose we are trying to store digital information into some macroscopic degree of freedom of some large system. Because we desire to store digital information, our system should have differing phases corresponding to the differing values of the information. For example, consider the Ising model in two or greater dimensions. In this case the macroscopic degree of freedom over which we wish to store our information is the total magnetization. In order to store information, we desire that the magnetization come in, say, two phases, one corresponding to the system with positive total magnetization and the other corresponding to negative total magnetization.
Assume now that the system is in thermal equilibrium. Suppose further that there are some other external variables for the system which you can adjust. For the example of the Ising model, one of these variables could be the applied external magnetic field. Since the system is in thermal equilibrium, each of the phases will have a free energy. Now, since we want our information to be stored in some sort of robust manner, we don’t want either of the phases to have a lower free energy, since if one did, the system would always revert to the phase with the lowest free energy and this would destroy our stored information. Since we require the free energies of all information-storing phases to be equal, we can always solve these equality equations for some of the external variables. This means that if we plot out the phases as a function of the external variables, we will always end up with coexisting phases along surfaces of dimension less than the number of external variables. For our example of the Ising model in an external magnetic field, what happens is that the two phases (positive and negative total magnetization) only coexist where the magnetic field equals zero. If you have any magnetic field in the positive direction, then the thermodynamic phase which exists in equilibrium is the phase with the positive total magnetization. So coexistence of phases, and in particular of information-storing phases, in the external variable space is always given by a surface of dimension less than the number of external variables.
What is interesting, and why this gets connected with my previous post, is Toom’s rule. Toom’s rule is a two-dimensional cellular automaton rule which exhibits some very interesting properties. Imagine that you have a two-dimensional square lattice of sites with classical spins (i.e. +1 and -1) on each of the lattice sites. Toom’s rule says that the next state of one of these spins is specified by the state of the spin, its neighbor to the north, and its neighbor to the east. The rule is that the new state is the majority vote of these three spins (i.e. if the site has spin +1, its northern neighbor has spin -1, and its eastern neighbor has spin -1, the new state will be spin -1.)
Toom’s rule is interesting because it exhibits robustness to noise. Suppose that at each time step, instead of performing the correct update, each site gets randomized with some probability. What Toom was able to show was that for the Toom update rule, if this probability of noise is small enough, then if we start our system with a positive magnetization (just like the Ising model, we define this as the sum of all the spin values) our system will remain with a positive magnetization, and if we start our system with a negative magnetization it will similarly retain its magnetization. Thus Toom showed that the cellular automaton can serve, like the two-dimensional Ising model at zero applied field, as a robust memory.
But what is nice about Toom’s rule is that it gives an even stronger form of robustness. Remember, I said that the noise model was to randomize a single site. Here I meant that the site is restored to the +1 state with 50% probability and the -1 state with 50% probability. But what if there is a bias in this restoration? From the Ising model point of view, this actually corresponds to applying an external magnetic field. And here is what is interesting: for Toom’s rule, the region where the two information-storing phases can coexist is not just the point where the (effective) external magnetic field equals zero, but instead a whole region of external magnetic field between some positive and negative value (set by the probability of noise). So it seems that Toom’s rule violates the laws of thermodynamics!
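Here is a minimal simulation of this effect (my own sketch, with arbitrary parameter choices): Toom’s north-east-center majority rule with noise deliberately biased toward -1, i.e. an effective magnetic field pushing against the stored +1. The magnetization nonetheless stays positive, which is exactly the behavior an equilibrium system at nonzero field could not show.

```python
import numpy as np

def toom_step(spins, p_noise, p_minus, rng):
    """One update of Toom's rule with noise: each site becomes the
    majority of itself, its north neighbor, and its east neighbor;
    then with probability p_noise it is instead randomized, landing
    on -1 with probability p_minus (p_minus != 0.5 plays the role of
    an applied magnetic field)."""
    north = np.roll(spins, -1, axis=0)
    east = np.roll(spins, -1, axis=1)
    majority = np.sign(spins + north + east)  # sum of 3 spins is never 0
    noisy = rng.random(spins.shape) < p_noise
    kick = np.where(rng.random(spins.shape) < p_minus, -1, +1)
    return np.where(noisy, kick, majority)

rng = np.random.default_rng(1)
spins = np.ones((64, 64), dtype=int)  # start in the "+1 phase"
for _ in range(100):
    spins = toom_step(spins, p_noise=0.05, p_minus=0.6, rng=rng)
print(spins.mean())  # stays well above 0 despite the biased noise
```

Running the same experiment with an equilibrium dynamics (e.g. the Metropolis Ising update from the earlier post) at the corresponding field would instead drive the magnetization negative.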
The solution to this problem is to realize that the probability distribution produced by Toom’s rule is not given by a thermodynamic Boltzmann distribution! Toom’s rule is an example of a probabilistic cellular automaton whose steady state is not described by classical thermodynamics. This is exactly one of the models I have in mind when arguing that I do not know whether the eventual state of the universe is going to be in Gibbs-Boltzmann thermodynamic equilibrium.
Along these lines, Charlie Bennett and Geoffrey Grinstein have, however, shown that while the steady state of Toom’s rule is not given by a Gibbs-Boltzmann thermodynamic distribution, if one considers the histories of the state of the cellular automaton, instead of the state itself, then Toom’s rule is given by a Boltzmann distribution over the histories of the cellular automaton. It’s at this point that my brain just sort of explodes. That a system’s histories are in equilibrium is very strange: normally we think about equilibria being generated in time, but here we’ve already used up our time variable! I suspect that the answer to this puzzle can be achieved by referring to the Jaynes approach to entropy, but I’ve never seen this done.

Back!

Posting has been nonexistent because I’ve been traveling. I’m now back from giving a lecture at the 4th biannual SQuInT student retreat, held at USC and organized by Todd Brun. I’ve been lucky enough to attend all four such retreats, once as a student, and three times as a lecturer (once as a graduate student, once as a postdoc, and once as whatever it is I am now 😉 .) This year I got the opportunity to lecture on quantum algorithms. I’ve put the contents of my slides online here.
The student retreat was a lot of fun, including a trip to the King Tut exhibit at LACMA. Unfortunately, my enjoyment was tempered by the nasty nasty cold I’ve come down with. I can’t hear anything out of my right ear. Bah!
SQuInT, by the way, stands for Southwest Quantum Information and Technology Network. Yeah, someone must have been smoking something when they thought that one up 😉

FOCS 2005 Papers

The list of 2005 FOCS (46th Annual IEEE Symposium on Foundations of Computer Science) accepted papers has been posted here. I see four quantum papers (out of 62, one to be announced). They are

“The Symmetric Group Defies Strong Fourier Sampling,” Cristopher Moore, Alexander Russell, and Leonard J. Schulman
“Cryptography in the Bounded Quantum-Storage Model,” Ivan Damgaard, Serge Fehr, Louis Salvail and Christian Schaffner
“Quantum Information and the PCP Theorem,” Ran Raz
“From optimal measurement to efficient quantum algorithms for the hidden subgroup problem over semidirect product groups,” Dave Bacon, Andrew M. Childs and Wim van Dam

I guess I can now officially call myself a “theoretical computer scientist.” Does this mean if someone says to me “Hermitian matrix” I cannot immediately follow with the words “diagonalize it”?

Thermodynamics is Tricky Business

Thermodynamics is one of the most important tools we have for understanding the behavior of large physical systems. However, it is very important to realize when thermodynamics is applicable and when it is not. For example, try to apply thermodynamics to the Intel processor inside the laptop I am writing this entry on. Certainly the silicon crystal is in thermal equilibrium, but then how am I able to make this system compute: if states are occupied with probabilities proportional to a Boltzmann factor, then how can my computer operate with all sorts of internal states corresponding to, say, its memory? Let’s say that all of these internal states, the states of my computing machine, are all energetically about the same energy (which is, to a decent approximation, true). Then, according to thermodynamics, each of these states should be occupied with the same probability. But the last time I checked, the sentence I am typing is not white noise (some of you may object, 😉 )
Today, Robert Alicki, Daniel Lidar, and Paolo Zanardi have posted a paper in which they question the threshold theorem for fault-tolerant quantum computation and claim that the normal assumptions for this theorem cannot be fulfilled by physical systems. I have a lot of objections to the objections in this paper, but let me focus on one line of dispute.
The main argument put forth in this paper is that if we want to have a quantum computer with Markovian noise as is assumed in many (but not all) of the threshold calculations, then this leads to the following contradictions. If the system has fast gates, as required by the theorem, then the bath must be hot and this contradicts the condition that we need cold ancillas. If, on the other hand, the bath is cold, then the gates can be shown to be necessarily slow in order to get a valid Markovian approximation. Both of these conditions come from standard derivations of Markovian dynamics. The authors make the bold claim:

These conclusions are unavoidable if one accepts thermodynamics…We take here the reasonable position that fault tolerant [quantum computing] cannot be in violation of thermodynamics.

Pretty strong words, no?
Well, having read the first paragraph of this post, you must surely know what my objection to this argument is going to be. Thermodynamics is a very touchy subject and cannot and should not be applied ad hoc to physical systems.
So let’s imagine running the above argument through a quantum computer operating fault-tolerantly. Let’s say we do have a hot environment. We also have our quantum system, which we want to make behave like a quantum computer. Also we have cold ancilla qubits. Now what do we do when we are performing quantum error correction? We bring the cold ancillas into contact with the quantum computer, interact the two, and throw away the cold ancillas. Now we can ask the question: is the combined state of the cold ancillas and the hot environment in thermal equilibrium? Well, yes, both are in thermal equilibrium before we start this process, but they will be in thermal equilibrium at two different temperatures. OK, so now we have an interaction between the system and the cold ancillas. So let’s do this. Now these two systems, the quantum computer and the ancillas, clearly couple to the hot bath. Therefore we can assume that the Markovian assumption holds and further that the gate speed for the combined system-ancilla system is fast. No problem there, right? OK, now we throw away the cold ancillas. So we’ve done a cycle of quantum error correction without violating the conditions set forth by the authors. How did we do this?
We did this by being careful about what we called the “system.” (Or, more directly, we have to be careful about what we call the “bath.” But really these are symmetric, no?) We started out the cycle with the system being the quantum computer. Then we brought in the cold ancillas. Our system now includes both the quantum computer and the ancillas. Since we are now enacting operations on this combined system, our environment is the original bath, which is hot (and which may now couple to the ancillas). We can perform fast gates on this combined system and then we may discard the ancillas.
In order for the authors’ argument to work, they have to assume that the “system” is always just the quantum computer. But then clearly the assumption of the environment being in thermal equilibrium is violated at the beginning of the error-correcting cycle: the ancillas are cold but the bath is hot. Both are independently in thermal equilibrium, but the combined system is not in thermal equilibrium at a single temperature. The interactions with the hot bath do imply that we can perform fast gates. The interactions with the cold ancillas alone would imply slow gates. But when we bring the cold ancillas and the quantum computer together, we can still have fast gates: our system now consists of the computer plus the ancillas, and the remaining environment is hot. The ancillas are not part of the thermal bath which causes problems for our quantum computer. Certainly the authors are not objecting to the fact that we can prepare cold ancilla states? So I see no contradiction in this paper with the threshold theorem. (A further note is that there is also a threshold for fault tolerance when the noise is non-Markovian. I’m still trying to parse what the authors have to say about these theorems. I’m not sure I understand their arguments yet.)
Thermodynamics is, basically, a method for reasoning about large collections of physical systems when certain assumptions are made about the system. Often we cannot make these assumptions. (A classic case of this, which is not relevant to our discussion but which is interesting, is the thermodynamics of a system of many point particles interacting via gravity: here thermodynamics can fail spectacularly, and indeed, things like the internal energy of the system are no longer extensive quantities!) In the above argument, we cannot talk about two systems being at the same temperature: we have two separate systems with different temperatures. Certainly if we bring them together and they interact, under certain conditions the two will equilibrate. But this is explicitly what doesn’t happen in the fault-tolerant constructions. This is, indeed, exactly what we mean by cold ancillas!
Understanding the limits of the threshold for fault-tolerant quantum computation is one of the most interesting areas of quantum information science. I’ve bashed my head up against the theorem many times trying to find a hole in it. I think that this process, of attempting to poke holes in the theorem, is extremely valuable. Because even if the theorem still holds, what we learn by bashing our heads against it is well worth the effort.
Updated Update: Daniel Lidar, Robert Alicki and others have posted responses and comments below. I highly recommend that you read them if you found this entry interesting!

More on Anyons in Honey

R.R. Tucci comments

Hey! Only 2 sentences devoted to a 100 page paper. Proust you are not. What I would like to hear from the audience are opinions on how far and in what sense does this paper advance the programme of topological quantum computing. And how close are Kitaev and Freedman to finding a physical realization of their ideas?

So I will spend more than 2 sentences.
The model Kitaev considers is a model with two-qubit interactions on a two-dimensional honeycomb lattice. These two-qubit interactions are all Ising interactions along different directions. Interestingly, Duan, Demler, and Lukin have shown how to implement these interactions in this geometry using an optical lattice. So, in some sense, this represents a very nice model with Abelian anyons which could be realized in a laboratory. What is even nicer is that Kitaev can actually exactly solve the Hamiltonian associated with this model. Unfortunately for quantum computing enthusiasts, this model does not support universal quantum computation. So, to answer the question, I would say that this model is exciting because it might be physically realizable, but the fact that it is not universal means that we will need something a little or a lot more in order to move towards topological quantum computing.