Over at Roseblog, Geordie gets on a high horse and tells us quantum theorists that we haven’t read enough because we haven’t read papers like this. Which, I guess, makes me not a quantum theorist! No, people like me, we never read outside of our narrow focus or try to learn as much about a field we work in as possible. We’ve never heard of this Aeppli guy either… what, even Scott Aaronson, the quantum theorist of theorists, cites his work? That’s just impossible!
Oh, and by the way, the paper sited shows a speedup of a factor of thirty with their “quantum annealing” strategy over their “classical annealing” strategy. See substantial comments below by Bill Kaminsky for why the above sentence is wrong.
And yes, I’m grumpy.
Actually the speed-up they quote is a factor of 10^10 for the specific example dealt with near the end of the paper (which is close to what a “real” problem of this sort would look like). Where did you get 30? Did you mean 2^30?
Colors blind; Sound deafens; Beauty beguiles; the enemy of stillness is desire. Eliminate desire, and the truth will become clear.
“Sited”? As in “cited”?
Ahh yeah, I was thinking of the theory paper, you’re right. Although even a factor of 30 is pretty nifty… if we could cut calculations that take 12 hours down to 24 minutes, that would be pretty awesome.
“…quantum cooling has produced a state for which the relaxation times are a factor of 30 faster than those produced by classical cooling”
Gentlemen, gentlemen, you’re all wrong. 🙂
*** Short Version ***
In this 1999 Science paper by Brooke, Bitko, Rosenbaum and Aeppli about which we’re arguing:
1) The quoted “relaxation times” do not directly refer to the runtimes of thermal (“classical”) vs. transverse-field (“quantum”) annealing algorithms.
2) The key bit of potential encouragement in regard to quantum speedups over classical annealing is the following. Their data is consistent with the idea that their 20 hours of quantum annealing produced the ground state of their LiHo_{44%}Y_{56%}F_4 spin glass sample. Such a feat is something that no practical period of classical annealing (or, indeed, any known classical algorithm) could ever do. Of course, their data’s also consistent with the much less sweeping interpretation that quantum annealing merely reaches a qualitatively different bunch of excitations over the ground state than classical annealing does. Even this less sweeping interpretation is quite encouraging, or at least intriguing. 🙂
*** Longer Version ***
Again, the first thing to note is that the quoted “relaxation times” in this 1999 Science paper do not directly refer to the runtimes of thermal (“classical”) vs. transverse-field (“quantum”) annealing algorithms.
I say this because:
(a) Brooke, et al. do not know the true ground state energy of their system,
(b) they cannot measure the energy of the final state attained in their thermal and transverse-field annealing protocols (and thus see how close in a given time they’ve come to the ground state), and
(c) as such, they do not run experiments where they vary the running times of their annealing protocols and see how long it takes to get to within a particular accuracy of the ground state.
Instead, what they do is perform classical and quantum annealing schemes, each apparently fixed to take 20 hours (see Fig. 2), and then measure on the resulting state the longitudinal magnetization response to a small, uniformly applied, longitudinal magnetic field with sinusoidal time dependence between 1 Hz and 10^5 Hz (the “AC susceptibility”). The “relaxation time” quoted is the reciprocal of the frequency at which the peak of the imaginary part of the susceptibility occurs (see the bottom of the first column on the last page of the paper).
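Concretely, the extraction is just peak-finding on the imaginary part of the susceptibility. Here is a minimal Python sketch with made-up Debye-like data standing in for the measurements; the functional form and the underlying time constant are illustrative assumptions, not numbers from Brooke et al.:

```python
import numpy as np

# Illustrative stand-in for measured AC-susceptibility data: a single
# Debye-like dissipation peak. The functional form and tau0 are assumptions
# for this sketch, not values from Brooke et al.
tau0 = 1.0e-3                                   # assumed underlying time scale (s)
freqs = np.logspace(0, 5, 2000)                 # 1 Hz to 10^5 Hz, as in the experiment
chi_imag = 2 * np.pi * freqs * tau0 / (1 + (2 * np.pi * freqs * tau0) ** 2)

# "Relaxation time" as defined above: the reciprocal of the frequency at
# which the imaginary part of the susceptibility peaks.
f_peak = freqs[np.argmax(chi_imag)]
relaxation_time = 1.0 / f_peak
print(f"peak of chi'' at ~{f_peak:.0f} Hz -> relaxation time ~{relaxation_time:.1e} s")
```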
In light of this, what the heck does it mean that Brooke, et al. found that the states produced by their transverse-field annealing had relaxation times “30 times greater” than the states produced by thermal annealing?
Well, we can’t entirely be sure because we don’t fully understand spin glasses, but Brooke et al. infer that it means that 20 hours of transverse-field annealing has produced something that could essentially be the ground state of the system. This in fact is a quite stunning inference, as one expects that no reasonable number of hours of thermal annealing (let alone the 20 hours of thermal annealing Brooke et al. do) can produce the ground state.
Their reasoning is as follows:
By definition, the spin glass phase is associated with the system being pitted with many, many metastable states. The most cautious reading of the “30 times greater” relaxation times observed for the transverse-field annealed states is that they correspond to local energy minima at the bottom of steeper energy wells than the metastable states reached by classical annealing.
However, there are grounds for believing that something more is going on (and thus the “stunning inference” I mentioned above). These grounds are depicted in Fig. 5 of the paper. The data in this figure come from Brooke et al. slightly changing where they end their annealing. In the case of thermal annealing, these changes yield notably different susceptibilities, and this corresponds to one of the most famous spin glass phenomena: there are so many metastable minima, and so many with noticeably different properties, that there is a very pronounced “history dependence” in which slightly changing how one prepares the system can cause it to exhibit substantially different magnetic properties. In contrast, in the case of transverse-field annealing, all these slightly different endpoints of the annealing scheme yield the same low-frequency (i.e., less than 100 Hz) susceptibilities. As Brooke, et al. suggest at the top of the second column of their last page, the natural conclusion is that the transverse-field annealing has essentially reached the ground state.
Natural, yes. Definite, no. A more cautious interpretation is that transverse-field annealing reaches a qualitatively different bunch of excitations over the ground state than thermal annealing does. This nonetheless is quite encouraging (or at least quite intriguing if you’re sure that somehow all hopes of major speedups are foolhardy).
———-
P.S. I’m curious as to what you mean, Geordie, by “the theory paper”. If it’s this 2002 Science paper, then it should be noted that it too is a mixed bag (though, again, basically encouraging/intriguing). While the authors (Santoro, et al.) do mention there that some numerical quantum Monte Carlo data is consistent with the notion of an exponential speedup of transverse-field annealing over classical annealing, the theoretical model they present implies only a polynomial speedup. (Interestingly, this polynomial speedup is beyond the generic Grover quadratic speedup. Their model implies a sorta generic cubic to sixth-order speedup over classical annealing. Of course, this is somewhat comparing apples to oranges, as the Grover speedup is over any classical algorithm for unsorted search. In contrast, the Santoro et al. result is a typical-case result for a class of random minimization problems and is a comparison to a specific classical algorithm, namely thermal annealing.)
Joe: In superconducting architectures you can get a pre-factor speed-up from the substrate change over modern semiconductor electronics. And the “30” they quote in the experiment can’t be used to make judgments about scaling for a variety of reasons.
All I’m saying is that something that is easily dismissed in theory (just a factor of 30) would be transformative in many computing contexts and worth billions of dollars. Large corporations make billion dollar bets on factors of 2 all the time.
And it is not true that you can easily get factors of 30 performance speed-up just by switching hardware. If this were possible it would be quickly done. In actuality even making a SINGLE doubling costs billions of dollars.
Take for example the Cell chip. This processor cost in excess of $2B to get running and its performance is MUCH less than a factor of 30 better than opterons etc.
In the case of QA vs. CA the best theoretical description of these systems shows scaling advantages, not pre-factor speed-ups.
Bill: Yes that was the one I was referring to, and in particular the case near the end where they show what ζ = 2 → 6 means in practice (the additional effort needed to get an extra 10x reduction in residual energy).
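To spell out that arithmetic: if one takes the logarithmic residual-energy law discussed in that paper at face value, ε_res(τ) ∝ (ln τ)^(−ζ), and assumes the same prefactor for both schedules (a simplification), the extra annealing time needed for another 10x reduction follows directly. A rough sketch, with the starting annealing time chosen arbitrarily for illustration:

```python
import math

def time_for_extra_factor(tau_initial, zeta, factor=10.0):
    """Annealing time needed to cut the residual energy by `factor`,
    assuming eps_res(tau) ~ (ln tau)**(-zeta) with a fixed prefactor:
    (ln tau_new / ln tau_old)**zeta = factor
      =>  tau_new = tau_old ** (factor ** (1/zeta))."""
    return tau_initial ** (factor ** (1.0 / zeta))

tau0 = 1.0e6   # assumed initial annealing time (e.g. Monte Carlo steps); illustrative only
for zeta, label in [(2.0, "thermal annealing (zeta ~ 2)"),
                    (6.0, "quantum annealing (zeta ~ 6)")]:
    tau_new = time_for_extra_factor(tau0, zeta)
    print(f"{label}: annealing time must grow from 1e6 to ~1e{math.log10(tau_new):.1f}")
```

With these assumed numbers the two required times end up roughly ten orders of magnitude apart, which is the kind of 10^10 gap quoted earlier in the thread.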
Unfortunately it doesn’t really work like that. A constant speedup is unimportant, as you can expect constant speedups when switching between different hardware.
Most quantum computer architectures have gating times which are much longer than the time taken to perform an operation within a modern CPU, so a 30x speedup gets swamped by the overhead of doing quantum computing in the first place (and it’s much worse if error correction is necessary).
The utility of a constant speedup from a quantum algorithm is questionable, since we have no reason to believe that classical and quantum gate times will reduce at the same rate.
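As a toy version of that overhead argument, here is a sketch in Python; every number below is an illustrative assumption, not the spec of any actual classical or quantum device:

```python
# Toy comparison: a constant 30x algorithmic speedup versus a slower clock.
# All numbers are illustrative assumptions only.
classical_op_time = 1e-9        # assume ~1 ns per elementary classical operation
quantum_gate_time = 1e-6        # assume ~1 microsecond per quantum gate
algorithmic_speedup = 30        # the constant-factor advantage under discussion

ops_needed = 4.3e13             # assumed operation count for a ~12-hour classical job
classical_runtime = ops_needed * classical_op_time                        # ~12 hours
quantum_runtime = (ops_needed / algorithmic_speedup) * quantum_gate_time

print(f"classical:            {classical_runtime / 3600:.1f} hours")
print(f"'30x faster' quantum: {quantum_runtime / 3600:.1f} hours")
# The 1000x slower clock swallows the 30x saving in operation count,
# leaving the quantum device ~33x slower overall -- before any
# error-correction overhead is even considered.
```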
Scott: I can’t help but think you haven’t thought this through very carefully. If someone could build a black box solver that could provide a cubic scaling advantage over simulated annealing, that by itself would be transformative. As you’re aware, simulated annealing is used in many contexts.
The theory paper Bill and I linked to provides an example that shows a factor of 10^10+ speed-up for a typical annealing problem using QA vs. SA.
What I can’t get my head around is why you’re constantly on about “overhyping QC” in the light of this type of result. Is it because you don’t think 10^10 makes a difference? Or because you don’t believe the result? Or because anything that doesn’t change a problem’s complexity class isn’t interesting and therefore isn’t worthy of press?
Speaking of irony, it looks as if you are guilty of the mental glitch you describe in your latest blog post, although you seem to have reconciled two logically incompatible points of view quite nicely and no-one (probably not even you) seems to have noticed.
I did indeed cite Aeppli et al., as giving the best evidence so far that states with superpolynomial tree size can exist in nature. But the irony goes further than that: when I interviewed at U. of Chicago, I had an excellent chat with Tom Rosenbaum, one of the condensed-matter physicists who wrote the papers in question and now the U of C’s provost. One of the main things we talked about was how to fight back against irresponsible quantum computing hype.
Bill, I freely admit that I have not yet read the papers mentioned. My comments were solely aimed at the question of whether or not a constant prefactor speedup is in any way meaningful.
Geordie: Obviously for any given problem there is a best known architecture, and switching to other architectures/implementations will not give you a speedup. However, it is worth pointing out that specialized hardware will always be better suited to dealing with the problems for which it was designed than more general purpose processors. The Cell chip is actually a good example. You can’t directly compare it to an opteron by using a single number. The Cell chip has been built with vector manipulation in mind, and so excels at tasks requiring this type of calculation, but does less well at other tasks.
Another, perhaps better, example is Folding@home, which can be run on a PC’s CPU, or using a special client on many high-end graphics cards.
http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
There are 200 times as many Windows clients as GPU clients, but they barely manage to do three times the work of the GPUs. That’s over a factor of 60 difference per client.
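Just to make the per-client arithmetic explicit, taking the quoted 200x and 3x figures at face value:

```python
# Per-client throughput implied by the figures quoted above.
cpu_clients_per_gpu_client = 200.0   # ~200x as many Windows/CPU clients...
cpu_total_work_vs_gpu_total = 3.0    # ...doing only ~3x the total work
gpu_per_client_advantage = cpu_clients_per_gpu_client / cpu_total_work_vs_gpu_total
print(f"each GPU client does ~{gpu_per_client_advantage:.0f}x the work of a CPU client")
# ~67x, i.e. "over a factor of 60" per client.
```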
Another reason that prefactors aren’t as clear cut as you seem to suggest is that there is far more, and far cheaper, classical computational power available (for the foreseeable future at least) than quantum computational power. Obviously I have no idea how DWave plan to price their future products, but given that I can put together a powerful cluster node for about $500 (maybe a bit more if I want to juice up vector processing by adding fast GPUs), for $1 million I could build an extremely powerful resource. Obviously if I buy an off-the-shelf solution, I’ll get a lot less, but even so…
Given how extremely difficult it is to build a working quantum computer, constant prefactors are unlikely to make them competitive with classical computers.
Obviously if there is some polynomial advantage then there will come a point when QC is the way to go, but with constant prefactors this is simply not the case.
A problem for which QIP offers a factor of 30 improvement, and which takes 12 hours on Blue Gene/L (or its future equivalent), will not be solved on a contemporary quantum computer in 24 minutes.
Thanks for the substantial post Bill Kaminsky. (Though isn’t what I said correct, just incomplete? 🙂 )
Doesn’t that 2002 Science theory paper only show evidence of changes in the quantum vs. classical annealing for a 2D Ising model without a longitudinal field (a problem which is in P; see L. Bieche, R. Maynard, R. Rammal, and J.P. Uhry, “On the ground states of the frustration model of a spin glass by a matching method of graph theory,” J. Phys. A 13:2553–2576, 1980)?
To your questions, Dave:
I’m afraid I must object strenuously to the statement “Oh, and by the way, the paper sited shows a speedup of a factor of thirty with their ‘quantum annealing’ strategy over their ‘classical annealing’ strategy.” This is because the quoted relaxation time for a state has nothing to do with the time it took to prepare the state by annealing. Rather, it has to do with the amount of time it’ll take a small perturbation to the state to dissipate away.
The importance of these relaxation times and the factor of 30 difference between the 20-hours-of-quantum-annealing-case and the 20-hours-of-classical-annealing-case is merely to establish that these cases were substantially different. Indeed, this probably isn’t even the most important difference that Brooke et al. observed. The most important difference is probably the fact that the quantum annealing seemed to “converge” (i.e., slight differences in the endpoint of the annealing seem to yield basically the same state, at least as measured by the sub-100 Hz longitudinal susceptibility) whereas the classical annealing did not “converge.”
As I hoped to explain clearly in my comment above, if one then interprets these observations in light of the usual notion that for any reasonable length of time the classical, thermal spin glass transition involves freezing out of equilibrium into some random “history-dependent” subset of metastable excited states that have substantially different magnetic properties from one another, then the fact that the quantum annealing apparently converged and the classical annealing didn’t would seem to imply that quantum annealing reached some region of state space that classical annealing never can reach in any reasonable length of time. In such a picture, there’s a speedup much, much greater than a factor of 30 (though once again, at the risk of beating a dead horse, the number 30 is a red herring since the data from which it comes don’t directly measure what we want to quantify).
The numerical quantum Monte Carlo simulations in the 2002 Santoro et al. Science paper were indeed for this case. As they themselves note, there is a classical polynomial time algorithm for finding the ground state(s) in this 2D-without-longitudinal-field case.
However, the theoretical model they present depends only on a semiclassical tunneling estimate and the density of states near the ground state of the problem Hamiltonian (and this latter quantity only in order to make a very coarse estimate of how many low-lying excitations there are). As such, I believe it holds beyond 2D. Admittedly, this can make one suspicious, as it’d imply that quantum annealing is *NOT* as efficient as the best classical algorithms for the 2D case, yet somehow is substantially more efficient than any known classical algorithm in 3D or higher.
That said, there is something that worries me considerably about the Santoro et al. result. The excitations treated by their method are most pronounced as the transverse field vanishes. But the whole reason they have small gaps in this region is that they chose the popular-with-condensed-matter-theorists-but-silly-for-computer-scientists model of a spin glass having a *continuous* distribution of coupling strengths. The region where one fundamentally has to worry about small gaps no matter what problem you solve is the region around the critical transverse field (i.e., where you shift from being dominated by the starting Hamiltonian to being dominated by the problem Hamiltonian). Now, given that Santoro et al.’s method is basically just a coarse bound on how much excitation a cascade of Landau-Zener transitions through all the low-lying excitations will entail, it should apply to the critical region too. But caution is necessary! If one’s dealing with a discrete distribution of coupling strengths, then becoming excited by a vanishingly small amount of energy in the critical region will yield an excitation at the end of the adiabatic interpolation of some fixed constant size. While such an excitation could well be the lowest-lying excitation possible, it should be remembered that the low-lying excitations in frustrated systems generally look a lot different than the actual ground states (say, in Hamming distance).
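For concreteness, the single-crossing Landau-Zener estimate underlying all of this is easy to evaluate numerically. A minimal sketch; the gap and sweep-rate values below are placeholders chosen only to show the exponential sensitivity, not numbers from either paper:

```python
import math

hbar = 1.054571817e-34  # J*s

def lz_excitation_probability(min_gap, sweep_rate):
    """Probability of ending up excited after a single avoided crossing,
    per the standard Landau-Zener formula
        P_excite = exp(-pi * min_gap**2 / (2 * hbar * sweep_rate)),
    where min_gap is the minimum energy gap (J) and sweep_rate is
    |d(E1 - E2)/dt| for the crossing diabatic levels (J/s)."""
    return math.exp(-math.pi * min_gap ** 2 / (2.0 * hbar * sweep_rate))

# Placeholder numbers, purely to illustrate the exponential sensitivity:
gap = 1e-27                       # J
for rate in (1e-21, 1e-19):       # J/s: slow sweep vs. fast sweep
    p = lz_excitation_probability(gap, rate)
    print(f"sweep rate {rate:.0e} J/s -> P_excite = {p:.2e}")
```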
Thanks Bill, I’ve scratched out the sentence and added a note to read your comments!
Another thing from the Brooke et al. paper which struck me was that they are ending up in a model with a finite transverse magnetic field…so this is really solving a quantum ground state problem, right? So this is comparing a classical annealing on a quantum model to a quantum annealing on a quantum model?
I agree with your assessment. I don’t know why they do it. My guess is that it may have to do with the following. Their final temperature is limited to be 0.030 Kelvin, yielding k_B T = 4.2 x 10^{-25} Joules. Their final transverse field is 7.2 x 10^3 Oersted, yielding \mu_{Bohr} H = 6.7 x 10^{-24} Joules. Maybe in the quantum annealing case they wanted the terms in the quantum Hamiltonian to be clearly stronger than k_B T?
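Checking those energy scales explicitly, with the bare Bohr magneton used for the moment (i.e., ignoring the Ho g-factor), so this is only an order-of-magnitude comparison:

```python
# Quick check of the energy scales quoted above (CODATA constants;
# 1 Oe taken as 1e-4 T, bare Bohr magneton used -- order-of-magnitude only).
k_B = 1.380649e-23          # J/K
mu_Bohr = 9.2740100783e-24  # J/T

T_final = 0.030             # K
H_final = 7.2e3 * 1e-4      # 7.2e3 Oe -> Tesla

thermal = k_B * T_final     # ~4.1e-25 J
zeeman = mu_Bohr * H_final  # ~6.7e-24 J
print(f"k_B * T  = {thermal:.1e} J")
print(f"mu_B * H = {zeeman:.1e} J  (~{zeeman / thermal:.0f}x k_B * T)")
```

So the final transverse-field term does indeed sit comfortably above the final thermal scale, consistent with that guess.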
Nevertheless, I’m pretty confident in the intuition that the main action lies in how one enters the glassy phase, whether by a classical route or a quantum route. So it’s justifiable to be encouraged/intrigued as to whether quantum annealing reaches states that classical annealing can’t reach in any reasonable length of time.
But while on this note of how the Brooke et al. setup is not a clean, ready-made-by-Mother-Nature analogue of the adiabatic quantum computers we hope to build someday, it’s also worth noting that their classical annealing route for LiHo_{44%}Y_{56%}F_4 undergoes *two* phase transitions (first paramagnetic to ferromagnetic, then ferromagnetic to spin glass), whereas the quantum annealing route undergoes only one, paramagnetic to spin glass. See their Figure 1.
Geordie, I don’t know what you’re responding to. I’ve never denied that polynomial quantum speedups are both profoundly interesting and (hopefully!) achievable. Furthermore, there are wonderful open questions what kinds of speedups quantum annealing could yield compared to classical annealing on “real-world” optimization problems, and I hope people will continue to investigate those questions empirically. (Of course, if someone claims to be observing a quantum speedup, then naturally I’m going to want evidence. And if that person not only doesn’t provide evidence, but rejects the very idea of doing so as “academic,” then naturally I’m going to be skeptical. But those are just obvious ground rules.)
Anyway, I just looked at the Santoro et al. paper. I have two questions; I’m hoping that those more knowledgeable than I am (Bill? Dave?) will be kind enough to enlighten me.
(1) Santoro et al. study a random 2D Ising spin glass problem, which (as they point out) is solvable in classical polynomial time by methods other than simulated annealing. Is there any experience indicating that results for 2D Ising problems are likely to carry over to the problems we care about more, like (say) random 3SAT?
(2) The theoretical analysis is based on Landau-Zener theory. How good is the evidence that this theory gives accurate predictions in the “computer-science regime”? (I think it’s fair to say that semi-rigorous “physics arguments” have sometimes worked spectacularly and sometimes failed spectacularly when applied to combinatorial optimization problems. So, what are the reasons to think we’re in one case and not the other?)
Thanks in advance!
To Scott:
To answer your questions…
The short answer is that Santoro does have a recent review article which covers his work since 2002, and this includes work on random 3-SAT. This work is available here (alas, J Phys A subscription required, and seemingly not on arXiv… by the way, the random 3-SAT stuff is reviewed on page R418, i.e., page 26 of the article).
Notably, this random 3-SAT work, which is all numerical (as best as I can tell; I’m basing this just on the aforementioned review article), is less encouraging, in the following specific sense, than his 2002 Science work on a 2D Ising spin glass with a continuous coupling distribution. In the 2D case, even *simulating* the usual linear-interpolation adiabatic QC with a classical algorithm (“path integral Monte Carlo”) appears to yield something better than classical simulated annealing. In the random 3-SAT case (at least on the one instance studied, which is near the SAT-UNSAT transition and is solvable by survey propagation), good ol’ classical simulated annealing outperforms path integral Monte Carlo simulations of linear-interpolation adiabatic QC.
I’d say this corroborates my concerns in my last comment. Namely, the theoretical model Santoro et al. present in their 2002 Science paper really is just a coarse bound on how fast amplitude leaks out of a state surrounded by lots of other states by Landau-Zener tunneling. This makes sense in the 2D spin glass model with a continuous coupling distribution, since there you always do have basically a band of closely spaced states around the ground state. In contrast, for random 3-SAT or any other problem with unit clause penalties, you don’t. Thus, what you’re really concerned with is excitations in the critical region where the true minimum gap occurs. Excitations there will then get adiabatically continued to problem Hamiltonian states that are at least 1 energy unit away from the ground state.
This is a nice segue to your next question…
Well, we physicists often don’t make it clear to outside observers, but some crazy things we do like the replica theory expressions that have turned out to give excellent answers for, say, the SAT-UNSAT phase transition in random k-SAT, are in fact “exact”. That is, we’re making no approximations whatsoever in the equations we write down to be solved. It’s just that we don’t know how to solve these equations without making some Ansatz—usually it’s based on some physical notion, e.g., it should be a stable free energy minimum—and then just blithely assuming that if it checks out as self-consistent, then it must be the *unique* solution. [Oooh, I love how mathematicians cringe at that one. 🙂 But really, there’s a lot to be said for finding that a physically reasonable Ansatz is indeed consistent with an exact equation.]
However, the Landau-Zener arguments for quantum annealing and the barrier scaling arguments for classical annealing are nothing like that. They’re just rough estimates (though note, conservative estimates I’d wager) of quantum tunneling and thermal hopping. I think it’s a decent bet that if these methods show a polynomial speedup, then there is a polynomial speedup. But I wouldn’t put huge stock in the actual order of the speedup produced. And, as I’ve argued above, they’re more natural for some regimes than others.