“The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.”
– E. P. Wigner
Our universe, or at least our understanding of the universe, appears to allow us to see its naked underbelly only through the use of mathematical reasoning. As Wigner says about this state of affairs, we neither understand nor deserve this. On the other hand, I’ve come to believe that this observation can also be a huge aid in describing the world of theoretical computer science. There is little doubt in most people’s minds that theoretical computer science is mathematics of a highly sophisticated nature (or, well, sophisticated to this lowly physicist.) But theoretical computer science, unlike pure mathematics unfettered in its abstract glory, at its core must be concerned with the practical applicability of its reasoning. Here by practical I do not mean contributing to the greater good of software development (though this may be important for the well-being, read salary, of the practitioners) but that at its core theoretical computer science must be about computers that exist in the real world. On the one hand, this observation leads directly to quantum computing. To paraphrase Feynman: the universe is quantum, damnit, so we better make our models of computing quantum. But it also should influence even our most basic classical computational models. In this vein, then, I would like to attack one of the most sacred holy cows of computer science, the holy mother cow of them all, the Turing machine.
The formalized description of the Turing machine is, no doubt, a major tool in codifying a language for the world of computer science. Recall that a Turing machine, in its most basic form, consists of a machine of finite memory with an ability to read and write symbols on an infinitely long tape made up of memory locations, as well as the ability to move up or down these locations. The basic operating step for a Turing machine consists of examining the symbol currently under its read/write head, examining its internal state (of which there are only a finite number), and, conditional on these two values, writing a new symbol at the current location, moving one direction on the tape, and switching to a new internal state. Such is the most basic model of a Turing machine. And there are many variations which are all “equivalent” to this model: Turing machines with multiple tapes, etc. (Wikipedia has a long list.)
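To make that basic operating step concrete, here is a minimal sketch in Python; the unary-increment transition table is just an illustrative example of mine, not anything from the post.

```python
# A minimal sketch of the basic Turing machine step described above.

def run_tm(tape, transitions, state="start", head=0, max_steps=100):
    """Repeat the basic step: read the symbol under the head, look up
    (state, symbol) in the finite transition table, write a new symbol,
    move one cell left or right, and switch to a new internal state."""
    tape = dict(enumerate(tape))      # the tape is "extendable" on demand
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(head, "_")  # "_" is the blank symbol
        new_symbol, move, state = transitions[(state, symbol)]
        tape[head] = new_symbol
        head += 1 if move == "R" else -1
    return [tape[i] for i in sorted(tape)]

# Example transition table: walk right over a block of 1s and append one more.
transitions = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("1", "R", "halt"),
}
print(run_tm("111", transitions))     # ['1', '1', '1', '1']
```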
Okay, so what’s my beef with the Turing Machine? Well, by now many of you might think I’m going to harp on the infinite tape. But no, not really: as long as the tape is always “extendable” this seems like a physically fine model (though don’t try to keep everything in one place or your machine might collapse to a black hole.) What really bothers me about this model is that it is not, or at least I don’t think it is, fault-tolerant. That is, if we consider a world in which errors occur: errors by the machine, errors on the data sitting on the tape, errors in moving, etc, then the Turing machine is most explicitly not fault-tolerant. This is really, in a sense, an old observation: in order to make a computation tolerant to errors one needs parallelism.
Consider, for example, a very long computation: the Turing machine starts, say, at one end of the input and starts hacking away. But the data stored on the tape in a galaxy far, far away is just sitting there, subject to the ever-present whims of the decay of data. Noise rates for digital information are low, but they are not, and for physical reasons probably can never be, zero. Thus if you are looking at a very large computation (and after all, theoretical computer science is all about asymptotic limits), then by the time the Turing machine needs to get out to that galaxy and process the information, the information out there will have been randomized and thus the computation will go bad. In order to make a truly fault-tolerant version of computing, it is necessary to have parallel computational power.
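A toy calculation (my own sketch, with made-up numbers) makes the decay argument concrete: a tape cell that the lone head does not revisit for T steps, with an independent flip probability p per step, keeps its original value only with probability (1 + (1 − 2p)^T)/2, which sinks to 1/2 as T grows.

```python
# Toy sketch (not from the post): how long can a tape cell sit unattended
# before it is effectively random?  Each step it flips independently with
# probability p, so it survives only if an even number of flips occurred.

def survival_probability(p, T):
    """Exact probability that a bit still equals its original value after
    T independent flip opportunities, each with probability p."""
    return 0.5 * (1.0 + (1.0 - 2.0 * p) ** T)

p = 1e-3   # a made-up per-step, per-cell error rate
for T in (10, 1_000, 1_000_000):
    print(T, round(survival_probability(p, T), 3))
# 10         0.99
# 1000       0.568
# 1000000    0.5   -> by the time the head finally gets back out to that
#                     far-away galaxy, the data there is just noise.
```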
But wait, aren’t parallel Turing machines equivalent in power to Turing machines (for example, see here)? The so-called parallel computation thesis states that “Time on any ‘reasonable’ model of parallel computation is polynomially equivalent to sequential space.” But what is missed in such constructions is that while one can construct a sequential Turing machine simulation of a parallel machine, if the latter is tolerant to errors, the former need not be. That is, the errors that the parallel machine tolerates show up in the simulation only as simulated errors on the sequential machine, not as physical ones; the physical errors that the sequential machine itself suffers will destroy the computation. Well, at least that’s what I think is true, but IANACS (despite being in a computer science and engineering department), so does anyone have more insights into this? I’d love to be told this line of reasoning is wrong and all is well for traditional Turing machines.
Well okay, so because nature doesn’t seem to allow noise-free computation, we shouldn’t consider the model of a sequential Turing machine as the basis of our theory of computer science. The model is, physically, an unattainable model. Instead what is needed is a model which allows for noisy computation and a high degree of parallelism. My claim is that this model can do things that the traditional Turing machine with noise cannot do.
Okay, so well, perhaps you find this objection silly. “I’m a theoretical computer scientist, I can study whatever damn model I want.” True, true. I think that’s the first amendment of the computer science constitution. But then what you’re doing is mathematics, and not science: you’ve lost the connection to the real world. In the real world, Turing machines crap out, as far as I can tell, albeit at scales probably large enough that we don’t have to worry about the Turing machine failing to capture modern classical computing machines.
But even more importantly, consider this. Perhaps one of the reasons why it is so difficult to prove major separations in computational complexity theory (think P versus NP) is that the theory has strayed too far from how the real world operates. I call this the physicist ego principle: mathematics chugs and churns at its best when it is related to the physics of how the universe operates. A case in point would perhaps be Ed Witten’s work on understanding the Jones polynomial of knot theory in terms of topological quantum field theory. This is, indeed, a kind of inverse claim to the quote by Wigner at the top of this post:
The miracle of the appropriateness of the language of physics for the formulation of the laws of theoretical computer science is a wonderful gift which we neither deserve nor understand (nor currently have a lot of evidence in favor of it, but let’s just speculate.)
– the Quantum Pontiff
Okay, yeah, I’m off on the deep end here. But isn’t it kind of a beautiful dream?
Interesting. There has been some work on designing algorithms for problems with random memory corruption: of course if the program store itself can be corrupted then all bets are off.
There are a couple of ways of addressing what you’re talking about. The easiest is to posit that there’s some “protected core” that isn’t corrupted by noise, and let that do the parallel simulations. As long as you have a ‘safe processor’, then all kinds of noise models can be simulated.
You might then argue that this is unreasonable, in that there’s never a way to shield a processor from noise. The answer to this is to treat noise itself as an attack, with limits. Clearly some result of the kind “if the noise probability is p, then the machine executes correctly with probability f(p)” can be shown, yielding a perfectly decent randomized TM model (which is the “standard” classical model really).
But what I take exception to is the argument that separations have yet to be proved because we’re too divorced from the real world. A much more well-founded argument to me is that the reason separations haven’t yet been proved is because we don’t yet have the right “spaces” to describe classes of resource-bounded machines. Mulmuley’s work is all about finding the right “geometry” and representations to describe machines that compute with some bounded resources, and in fact he’s using some of the mathematics that eventually shows up in things like quantum field theory. That doesn’t mean that we “need physics”: at best you could argue that physics provides motivation to study certain kinds of mathematics which in turn has other pleasant consequences.
Thanks for the comment Suresh.
I agree that the latter argument is one that should cause all to “take exception to!” I guess the thought I was trying to provoke is that there is a converse to the Wigner statement, and while many subscribe to Wigner’s view, we have trouble with the reverse statement. Indeed the idea that separations will just magically fall from redefining the model is silly. But sometimes silly ideas are fun to follow to their conclusions 🙂
Even most mathematicians I know would agree with your assessment about mathematics. I wrote an essay about mathematics and the use of language in science for the recent FQXi essay contest that dove into these ideas a bit. While the 20th century witnessed an exponential growth in our understanding of the universe this also, inadvertently, has led to an increase in what amounts to a sort of “mathematical philosophy” (I have another term for it that is probably not appropriate for this blog). We’re becoming enamored with the models and are forgetting about how well they correspond to reality.
For once I’ll write a short post … and it’s pure fan mail … Dave, this is IMHO a wonderfully thought-provoking post … & a sequel would be *very* welcome!
Antecedents of Wigner’s 1960 quote can be found in a delightful 1947 essay by von Neumann—that once was famous, but perhaps no longer—titled “The Mathematician”.
Gosh … maybe it’s time for a QIP-inspired *remake* of these two classics! 🙂
This discussion (so far) is reminiscent of Reinhard Selten’s (in)famous aphorism “Game theory is for proving theorems, not playing games.” Folks who want to see how real-world error handling works can consult Section 1.10: Single Failure Tolerance/Redundancy of that fabulously interesting document, JPL Design, Verification/Validation and Operations Principles for Flight Systems.
Because when your spacecraft is orbiting Mars, Jupiter or Saturn … the error handling *really* needs to work. Unsurprisingly, this is achieved by a judicious mixture of rigorous theorems with real-world engineering design principles. 🙂
Dave: As you all-too-well know, I am not a mathematician or computer scientist so my rantings may seem idiotic to you.
But I don’t see how parallelism solves your data corruption problem. To wit, there is no reason why an infinite number of parallel tapes cannot all be corrupted. Hence to have a 100% fault tolerant Turing Machine as you envision, mustn’t you also have an infinity+1 number of parallel tapes….which brings you back to the problem of theory versus real world.
In the real world, there’s the backup plan, and when the backup plan fails, you have the backup-backup-plan. And when the backup-backup-plan fails, then you’re SOL. (See: http://www.urbandictionary.com/define.php?term=sol)
Fortunately for those of us old enough to remember both the Blue Screen of Death http://en.wikipedia.org/wiki/Blue_Screen_of_Death and the movie, Jurassic Park, there was always the circuit breaker in the back of the room.
In the real world, there’s Six-Sigma http://en.wikipedia.org/wiki/Six_Sigma
and realistically building a system that has a much higher level of fault tolerance is going to be cost prohibitive, impossible, or both.
“Failure is always an option” in any real-world system.
By the way, since no one else took the bait, I’ll say it: nice Pink Floyd reference.
“…from the pale and downtrodden.”
> Ian, where was that ref? Search doesn’t find it on this page.
The title of this post – On the Turing Away – is, presumably (since Dave is a big Floyd fan) a play on the Pink Floyd song “On the Turning Away” from A Momentary Lapse of Reason (which, in my humble opinion as someone who is mostly a fan of pre-Dark Side Floyd, is an underrated album). If I have incorrectly presumed the inspiration for this post’s title, Dave can correct me.
Gary: thanks for the long response.
I definitely agree that we traditionally separate out the model of TM from the “engineering problem” of noise. But my point is that while we have _traditionally_ done this, a priori, when reasoning about computing machines in the real world, it’s not clear why we make this separation. I guess an analogy is Newtonian mechanics and General Relativity. Yes, you can work with the model of Newtonian mechanics in studying planetary motion, and it will give you all sorts of things you can prove about such systems, but there is a limit in which one needs to go to General Relativity to get the results that match the real world.
As to redundancy and parallelism….it’s certainly true that we use redundancy to deal with faults, but parallelism is also needed. A classic paper for this in quantum computing is http://www.cs.huji.ac.il/~doria/Papers/poly_sim_of_noisyQC_benor96.ps …. there are similar classical results, but I have to run to dinner and will search for the reference when I get back!
Neil, the way I’ve recently come to understand it (because I had always thought the “universes” spawned by a split could no longer interact – apparently I may be wrong) is that future evolution of the two paths could potentially recombine them briefly. In other words, the interference effects we get are actually a result of interfering universes. But I could be wrong.
As for your debate about mathematics, some colleagues and I from a variety of departments have had an ongoing discussion about this. By and large, everyone seems to agree that, on the one hand, mathematical models of the universe are simply that – models. They don’t represent reality since they are inherently imperfect. On the other hand, mathematics itself is viewed by these folks as being about as close as you can get to the Platonic ideal. As one mathematician put it, while physical theories regularly need to be corrected, as far as he knew, no mathematical theory has ever needed correction. Of course, my only trouble with that is that, since we are privy to mathematics, it must be part of this universe and, thus, part of reality in some way.
utopia27 (is that like Stalag 17?) –
Basically that spitball fight is the constructivist versus non-constructivist war. The Axiom of Choice is inherently non-constructivist.
Here’s another song for this thread! 🙂
Neil,
Yeah, it confused me too when I heard it since I thought they couldn’t interact.
Is the miracle of the appropriateness of the alphabet for the formulation of words something we don’t understand? Is it even a “miracle”?
Nature is not bound by the “laws” we create for it. Our “laws of physics” at any given time are the most accurate models we have at that time. These have changed in the past and I think it would be arrogant and foolish to think they will not change in the future. A common misconception is that these models build on each other so that future models will only consist of small corrections to our present models. This may be so in terms of predictions but that doesn’t imply the same is true of their assumptions. For example, the assumptions going into deriving Newton’s “laws” and Einstein’s “laws” are completely different – in fact they are logically contradictory – while at the same time they make similar predictions.
Scientists aren’t the only ones who come up with theories and explanations for their experiences. It just so happens that, because they are so complicated, our scientific theories needed more than a 26 letter alphabet and a little syntax.
Mathematics for science is no miracle – it’s just a bit of human ingenuity.
I wonder why you didn’t mention “distributed algorithms” and “models of concurrency” in this post. Yes, the sequential TM is insufficient when it comes to distributed algorithms (a field which studies all kinds of fault types, including malice).
There are several models of concurrency, but I don’t think there is a universal model that encompasses all of them. That would be my dream. My next dream would be a programming language based explicitly on such a model instead of just a TM.
Robin: indeed! One point I was trying to make is that using a model which is more in line with physics allows for the possible miracle of physics based reasoning. (that’s the final paragraph.) But to your point I guess it’s not clear, at least to me, that the kind of parallelism where you get fault-tolerance can be simulated efficiently on the noiseless TM model: in particular it seems to me that you need to allow a number of parallel machines which grows with the size of the input, and this would in turn require for simulation by a standard TM a modification of the TM model where, say, the state space or space resources grow with the size of the input. In short it’s not clear to me that the converse of your first sentences in your second paragraph applies (so the parallel model could be more powerful as well as being more realistic.)
Maybe this is a dumb question – why can’t the error handling on parallel tapes be done on a single sequential tape?
Robin: someone needs to use galactic scale parallelism in a paper 🙂
anandi and rest: yes a really really bad floyd reference!
Indeed the idea that separations will just magically fall from redefining the model is silly.
I disagree. We’ve seen closure results magically fall from redefining the model (http://arxiv.org/abs/quant-ph/0412187). Perhaps collapses are more likely to fall than separations, but it still seems worth investigating.
Ian, where was that ref? Search doesn’t find it on this page.
This is longer than I expected. But I can not find a shorter way without leaving critical links unlinked. Dave, feel free to edit. Fellow followers, skip if you feel I blather.
I’m not sure that it is productive to mix discussion of Turing Machine theory with discussion of correctness, as they are really different and orthogonal beasties.
The TM is an abstract model of sequential computation constructed to make it possible to reason about what sorts of problems can and cannot be solved by sequential methods. It turned out so well that it also defined a complete set of operations and capabilities, such that any machine that can be shown to implement a TM can solve the same set of problems; further, all such sequential machines are exactly equivalent to TMs (i.e. Turing complete) and cannot solve problems that a TM cannot. All machines today are Turing complete. The only departures are what Dave is working on – quantum computing – and some lab experiments with molecular computers.
Dealing with errors, OTOH, is a different discussion. I used the term “correctness” earlier. This breaks down into examining the several sources of error and methods for combating, preventing, or recovering from them. All of which ultimately come down in practical terms to cost/benefit analyses of their likelihood and mitigation implementation costs, which is an engineering exercise. For it turns out that few of the sources of errors or incorrect behaviors are deterministic and decidable; most are a result of the fact that our Universe is a messy place, and so are our minds, and that is not susceptible to formalization.
The major sources of incorrectness of result are:
Problem statement incorrect
Solution incorrect
Implementation of solution incorrect
Cosmic rays flip logic states in hardware (soft errors)
Cosmic rays or ion migration cause logic circuits to fail (hard errors)
Control/monitoring communications channels disrupted
Erroneous operating instructions sent to the system
Power problems
Heat problems
Timing problems
Interference and EMI problems
“Is your computer plugged into the wall socket” problems
etc. etc. etc.
There are multiple mitigation methods for attacking each. And they all overlap. Like I said, messy.
I think the key word for error mitigation is redundancy, not parallelism. The latter is one way of achieving redundancy, but not the only way, and indeed when it comes to actually implementing the corrective action, not sufficient. For example, in data communications, the redundancy is achieved by serially repeating, verbatim or encoded, the data to be communicated over a single channel, usually only 1 bit wide (though more width is computationally equivalent and does not per se contribute to redundancy, only speed). Typically this would be via either hash signatures (aka CRCs, or cyclic redundancy checks) for detecting damaged blocks so they can be repeated, or by the more expensive use of forward error-correcting codes, where enough of the data is repeated in the original message that broken bits can be reconstructed from what was received without the need for retransmission. The redundancy part is the sending of more than 1 bit per bit of payload, but this is not sufficient without the mechanism for recomputing the correctness of that extra part or the changes needed to the payload to recover correctness.
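A minimal sketch of the two flavors described here, using the simplest possible codes (a single parity bit for detection, a threefold repetition code for forward error correction); real links use CRCs and far more efficient codes, this is only to make the distinction concrete.

```python
# Detection vs. forward error correction, in miniature.

def add_parity(bits):
    """Detection only: append one redundant bit so any single flip is
    noticed and the block can be retransmitted."""
    return bits + [sum(bits) % 2]

def parity_ok(block):
    return sum(block) % 2 == 0

def fec_encode(bits):
    """Forward error correction: send each payload bit three times."""
    return [b for bit in bits for b in (bit, bit, bit)]

def fec_decode(block):
    """Reconstruct each payload bit by majority vote, no retransmission."""
    return [int(sum(block[i:i + 3]) >= 2) for i in range(0, len(block), 3)]

payload = [1, 0, 1, 1]
sent = fec_encode(payload)
sent[4] ^= 1                           # one bit broken in transit
print(fec_decode(sent) == payload)     # True: the damage is repaired

damaged = add_parity(payload)
damaged[2] ^= 1
print(parity_ok(damaged))              # False: damage detected, ask for a resend
```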
Similarly, when parallelism is employed for the redundancy, there is the extra layer of hardware that reconstructs the single result from the multiple copies. Which is in itself a possible source of error.
All of which for practical purposes comes back to cost: cost in time to design and deploy, cost in time to compute and correct, cost in hardware to compute and correct, and cost in extremities of error cases that can be handled vs those more catastrophic that can not by the chosen method and what they mean for the mission. Every situation is essentially different and warrants a different mix of the many ways of correcting errors. I would even state categorically that, unlike TM Computational Theory, a general theory of provable correctness for real world systems is an NP-complete problem and can not be handled algorithmically by any Turing-Complete system.
To recap: TM is a theory of computation without regard for error resilience or completion; correctness/error recovery is an engineering problem; completion (is the problem decidable, will the program reach its halting state, will it do so inside the time limits of the mission) is a problem description and solution design problem.
I’ve been in a running argument with “Bee” over at Backreaction about modeling “random processes” with mathematics. The point I’ve made (which is actually orthodox thinking in the foundations of mathematics) is that a given mathematical process or algorithm etc. is deterministic. That is, it has to produce the same series etc. each time it is “run”. Think digits of pi, cube root of 23, etc. (A computer’s “random number generator” has to cheat by either re-seeding the generator with a hand-picked source, or by using outside noise.)
But “true quantum randomness” is supposed to mean that running even apparently identical things over again might produce the sequence 100100110100111 one time and 1100001001111001010 … on another trial, etc. That is true even for “structureless particles” like muons undergoing decay. Hence, there is no way to put the same mathematical model inside muons to determine the actual decay times. (BTW, stochastic variables etc. are also a way of cheating, since you just say they produce the range of random results without giving a way to produce specific examples.) So if the universe can be modeled completely with math as MUH implies, then apparent randomness would have to be clumsily contrived by in effect inserting a clockwork or pre-set algorithm into each muon etc. in a credible distribution. That sounds clunky, so things like that would have to be “genuinely random” and therefore not created by the equivalent of mathematical processes.
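A tiny illustration of the “cheating” mentioned above: a pseudorandom generator is a deterministic algorithm, so the same seed always reproduces the same sequence, and the output only varies from run to run because the seed (or os.urandom) pulls entropy in from outside the algorithm.

```python
import os
import random

gen1 = random.Random(42)
gen2 = random.Random(42)
print([gen1.randint(0, 9) for _ in range(8)])
print([gen2.randint(0, 9) for _ in range(8)])  # identical: same algorithm, same seed

print(os.urandom(4).hex())  # differs between runs: entropy injected from outside
```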
What about many-worlds? Well, I have plenty to gripe about but here’s one of my biggies: if possible measurements produce “splits” but there is nothing special about the process of measurement or measuring devices per se, then wouldn’t the first BS in a MZ interferometer instigate a “split” into two worlds? That is, one world in which the photon went the lower path, another world where it went the other path? But if that happened, then we wouldn’t see the required interference pattern in any world (or as ensemble) because future evolution would not recombine the separated paths at BS2. Reflect on that awhile, heh.
I must re-echo John Sidles’s comments – there are ultra-high-availability environments in which fault-tolerance has been designed in. These environments normally revolve around space and aviation, although nuclear reactor control comes to mind as well. The three parallel flight control computers in the F-16 spring to mind…
These systems normally revolve around voting models, with computations checkpointed at regular junctures, and multiple parallel computations contributing votes for the ‘right’ answer at any particular checkpoint.
I haven’t done the calculations in depth, but it’s pretty straightforward to show that multiple side-by-side Turing machines can achieve any finite level of fault tolerance with a finite number of Turing machines, and a finite degree of overhead for gathering/distributing vote results. If we keep all the finites in a single bucket, we wind up with a polynomial overhead.
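A rough sketch of that checkpoint-and-vote scheme, under the (made-up) assumption that each replica independently corrupts its copy of the state with probability p between checkpoints and all replicas then adopt the majority value:

```python
import random

def run_with_voting(n_replicas, n_checkpoints, p):
    """Advance every replica one checkpoint (each may corrupt its state with
    probability p), then resynchronize all replicas to the majority value."""
    correct = [True] * n_replicas
    for _ in range(n_checkpoints):
        correct = [ok and (random.random() > p) for ok in correct]
        majority = sum(correct) > n_replicas // 2
        correct = [majority] * n_replicas
    return correct[0]

def failure_rate(n_replicas, n_checkpoints=1000, p=0.01, trials=2000):
    return sum(not run_with_voting(n_replicas, n_checkpoints, p)
               for _ in range(trials)) / trials

for n in (1, 3, 7):
    print(n, failure_rate(n))
# A single machine almost surely fails over a long computation; a handful of
# voting replicas already drives the failure rate down by orders of magnitude.
```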
I think it’s also pretty straightforward to demonstrate that eliminating the possibility of failure in a particular noisy computation environment is impossible. A pathological case where ALL the bits flip to ALL the wrong results in the SAME way at the SAME time is an admissible possibility – and one that can’t be detected.
So we’re left with fallible Turing machines, which we can hitch together in tandem to get any (finite) desirable level of tolerance.
But me… I’m waiting for real quantum (or wet-ware, or DNA-synthesis, or..) computation to come in and break our definitions of NP-Hard. These are truly non-Turing models of computation. Though it’s possible that the current crop of wet-ware neuron-emulators are Turing-reducible…
One of the real advantages of theoretical computer science is that we know where the vulnerable bits of the really hard problems are. If we can build a quantum (or other non-Turing) computer that provides the best-10 list of traveling salesman tours in anything like a time polynomial in the size of the problem, then we need to redraw all the maps in computer science…
Neil, Ian –
I hate to break into a beautiful argument, but I have to disagree with Neil’s statement that (to paraphrase) mathematical processes must be deterministic, and the characterization that that’s a well-accepted position in the foundation of mathematics.
My experience has been that when probabilists run into foundational set theorists, they throw a lot of spitballs at each other. Most of those spitballs are of an “Axiom of Choice” shape. These confrontations tend to end indecisively, spawning the occasional newly-minted chaotician amidst the wrack and wreckage.
Foundational set theorists will tell you that everything in mathematics reduces to foundational axioms, and they’re expressed in terms of set theory. They’re right. No argument. What creates arguments is what’s a valid foundational axiom to build from. The Axiom of Choice is very disputatious because it says you can pick one-of-each from an infinite number of things, as long as you don’t care about how you select that one from the each. That’s where the random comes in.
Foundational set theorists are forever running after probabilists, trying to clean up the probabilists’ proofs, and reduce them to foundational axioms. It’s not bad work. Frequently, however, probabilists will invoke the Axiom of Choice to allow them to do something that looks pretty random. Then the set theorists will spend quite some time trying to re-express the proof without the Axiom of Choice. And it mostly works out – and looks much less random. When the re-expression doesn’t work out, the set theorist doesn’t generally publish, so we don’t know a lot about the non-results. So far (though I haven’t been looking) I haven’t seen any definitive demonstration that there are (or are not) useable, valid proofs that can not be constructed without the Axiom of Choice, or what such a set of proofs/demonstrations would look like as a class.
I bring up the occasional dazed chaotician, because they’ve got a unique perspective on determinism v. randomness. Chaoticians will tell you that everything’s deterministic, but that it matters not one whit, because you can’t actually distinguish the determinism from randomness in certain cases. (One of) The distinguishing features of chaotic regimes is an exquisite sensitivity to initial conditions. Take a rigidly defined, rigorously deterministic process, and run it out for a while. Record the results. Do it again. The second time, the outcomes will be different. Because the initial condition experienced in the second experiment was imperceptibly different than in the first one. And that’s a mathematician’s definition of “imperceptibly”, which is saying something.
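A standard toy illustration (mine, not the commenter’s) of that sensitivity: the logistic map is a perfectly deterministic rule, yet two starting points that differ by 10^-12 end up nowhere near each other after a few dozen steps.

```python
def logistic_orbit(x, r=4.0, steps=60):
    """Iterate the deterministic logistic map x -> r*x*(1-x)."""
    orbit = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

a = logistic_orbit(0.123456789)
b = logistic_orbit(0.123456789 + 1e-12)    # "imperceptibly" different start
for step in (0, 20, 40, 59):
    print(step, abs(a[step] - b[step]))
# The gap grows roughly exponentially until the two runs are completely
# decorrelated, even though every step applies exactly the same rule.
```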
So Neil – take your muon (and all the other muons – even mine), and install a chaotic recurring process in it that takes the gravitational contour as an input. That little chaotic process will tick along precisely deterministically for all time, and it won’t matter a whit to you, because every iteration and every muon’s expressed decay will be slightly different, based on the ongoing expansion of the universe, because the gravitational contour is never _quite_ _exactly_ the same as it was a fraction of a second ago.
So it can be correct to say that the universe is a perfect clockwork of mathematical and deterministic processes, and also to say that to all appearances (ALL appearances) the universe contains processes whose outcome is indistinguishable from randomness.
Utopia27, Alan – the uncertainties in applications like axiom of choice apply to our knowledge of math and aren’t really ways to produce different sequences from the same “process.” You still can’t do that “from math” without “cheating.” As for interacting multi-worlds: that seems to overturn the whole original idea that they split and then could never contact each other. And then, what does “other” mean anymore? BTW I think I have an experimental procedure that would actually recover information apparently lost to decoherence, it’s a rough draft at name link but intelligible.
Dave,
Lots of comments and divergent discussions here, but I’d like to get back to your original post. I’m a little unclear on exactly what’s bothering you here.
I’ll happily grant that a noisy single-tape TM isn’t Turing-complete, i.e. can’t simulate a noiseless TM. But a parallel TM can simulate a noiseless TM. So the basic computational model that we call “Turing complete” is still valid, relevant, etc. If you believe that noise is fundamental (as von Neumann did), then you think of this model as the “noisy parallel Turing machine model.” But it still does all the same stuff. So what’s the issue?
I guess it seems like the existence of parallel fault tolerant machines is sufficient to justify all the TCS folks who only think about what perfect single-tape machines can do.
While the post itself and the comments make my eyes cross and my head hurt (I’m trying not to have the Barbie voice in my head saying ‘physics is hard’), I LOVED the post title/Pink Floyd reference. That was about the only thing I *did* catch in your post (and I did notice it before Ian pointed it out in the comments as well) 😀
Dave,
I see — the problem is that for fault tolerance you need really powerful parallelism. Which is sort of ironic, because although I didn’t realize that, my original draft comment contained a digression about how I didn’t like your tape-to-a-distant galaxy example ’cause it seemed to require… wait for it… parallelism that scales with tape size. So, yeah, okay, I agree now.
In other news, it’s possible to write comments from a Blackberry.
We can come at (what I take to be) Dave’s point from a different angle, by asking “What is the single most ubiquitously present, mathematically subtle, and physically important invariance of QM?”
The long-known (and obvious?) answer is Nielsen and Chuang’s Theorem 8.2: “Unitary freedom in the operator-sum representation”
This invariance is ubiquitous and physical, because it enforces classical causality. And the invariance is mathematically subtle, because it is associated with metric structure and stochastic Berezin functions, as contrasted with the (simpler) symplectic structures and (simpler) scalar Berezin one-forms, of Hamiltonian QM.
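For the curious, here is a small numerical check of that invariance (my own sketch, following the statement of Nielsen and Chuang’s Theorem 8.2): two Kraus sets related by a unitary matrix describe one and the same channel, here a dephasing channel written two different ways.

```python
import numpy as np

p = 0.3
I = np.eye(2)
Z = np.diag([1.0, -1.0])

# Operator-sum representation #1 of a dephasing channel
E = [np.sqrt(1 - p) * I, np.sqrt(p) * Z]

# Representation #2: F_j = sum_k u[j, k] * E_k with u unitary (a Hadamard here)
u = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
F = [sum(u[j, k] * E[k] for k in range(2)) for j in range(2)]

def apply_channel(kraus, rho):
    """Apply sum_k K rho K^dagger for the given Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus)

rho = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])   # an arbitrary state
print(np.allclose(apply_channel(E, rho), apply_channel(F, rho)))  # True
```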
So why isn’t this invariance taught more widely? (And why doesn’t it have a common name?)
One reason has to do with Dave’s post: these elegant structures are culturally associated with noise … which is for many researchers the schmutz/untouchable/burakumin aspect of QM.
Thus I take the intuition behind Dave’s post to be that noise is (mathematically speaking) a 20th century ugly duckling of QM, that is evolving to become a 21st century swan.
A key question (IMHO) is this: can we pull back this invariance onto lower-dimension manifolds … in a way that is simultaneously consistent with causality, GR, and field theory? … and that does not result in what Bennett calls “computational extravagances.”
It is (IMHO) perhaps *not* a coincidence, that even to seriously try to answer this question, requires all the mathematical tools of string theory.
In summary, I too think that noise studies in QM may be an ugly duckling that is maturing to become a swan.
BTW, we should also think of the tragedy that befell Alan Turing on account of how the society and legal system of the time persecuted him for being homosexual. It is outrageous that such a brilliant mind was goaded by moralism into taking his own life. As for the Turing Test, isn’t it ironic that the machine would be considered equivalent to a human mind, if a human judge could not tell the difference. (Uh, how long is that allowed to take? – clear issue of sliding and debatable time and effort scales, as well as varying human competency etc. Not a very rigorous test!) But then, what if such a machine could in turn tell the difference between an equivalent to itself and a human, then what? That entity wouldn’t really be equivalent after all?
Bit of left over re randomness etc: infinite sets are special anyway, and issues of ambiguity about what should be in a set are not the same as being able to “produce” outcomes that are not always the same. See more at this discussion: http://backreaction.blogspot.com/2010/01/is-physics-cognitively-biased.html
as well as a contentious but illuminating (I hope) debate between me and others over the viability of the decoherence interpretation of quantum mechanics.
I have to admit that I don’t really understand what you’re getting at here. After reading the first paragraph I thought maybe you were going to discuss abstract models of quantum computing (which, by the way, would make a great post–hint, hint), but you seem to be unhappy about the fact that Turing machines might not be the best fit for modeling certain aspects of classical computing. Should they be? Turing created his machines for studying computability. It turns out they’re also pretty handy for studying computational complexity and a number of other topics as well. In fact, the ubiquity of Turing machines is a testament to the fact that Turing pretty much nailed it with his abstraction of what constitutes an algorithm. But that doesn’t mean that anyone should conclude that TMs are the only tool needed for analyzing computations. That notion reminds me of the complaints by Peter Wegner and Dina Goldin that TMs don’t model interaction.
Mathematical models are created for their usefulness in describing and analyzing a problem. TMs don’t quite work for the problem at hand? Maybe you can extend them by adding other features like probabilistic behavior or different tape structures. Maybe something that doesn’t look like a TM, like circuits or RAM models, more closely resembles the situation you’re modeling. Or maybe you need something more exotic, like the topological structures used in studying concurrency. My point is, I think you’re arguing against a straw man here. I’m not familiar with the study of fault-tolerance, but I’d be willing to bet that people who do this for a living aren’t using TMs on a daily basis. (On the other hand, if I randomly flip bits in the memory of my Windows PC, I stand a pretty good chance of getting a blue screen of death–and I think a TM pretty much nails that particular behavior.)
While I’m at it, I may as well add that I’ve never been impressed with that Wigner quote, either. Mathematics was created to model the world in which we live. Of course the language of mathematics is appropriate for formulating the laws of physics; that’s what it was made for! It would be totally stupid if that language of mathematics *wasn’t* appropriate for formulating our scientific laws. And if our universe was different, with laws of an unrecognizable nature, then our mathematics would be different in kind.
Kurt, exactly. The people who study fault tolerance do not use “just TMs”. If I remember correctly, I/O automata date back to the late 80s.
As I said, this post falls under distributed algorithms/concurrency models in CS.
@23: I do not think it possible to create a fault-tolerant system with noiseless behavior. This is because, for any given fault-tolerant machine, there is a threshold of noisiness above which it is no longer equivalent to the noiseless simulation. Of course, I may have got your point backwards, although if so then while it is possible to simulate a fault tolerant machine via a single noiseless classical TM, you’ll never know if you got it right because the set of noisy inputs that would cause the machine to fail is infinite. And by their nature not inductively enumerable so you (in the general case) would not be able to even create an algorithm to enumerate them.
Re parallelism: I view this as one way to construct redundancy into the system so that fault detection and correction become possible. There are other ways, such as signing, hashing, distributing, etc, in the serial domain vs parallel. All of the above come into play in real world non-stop systems. But a) every such system can only defend against an enumerated set of noise conditions, and b) there will always exist a component of the system which will be susceptible to undetected single point of failure errors (in the error detection component itself). A bit Godelian. In the spirit of who shaves the barber, or who watches the watcher. The big design question is always: what are the likely failures we can guard against, how unlikely are the others, and can we afford to be exposed to them? And always, what we don’t know can kill us.
However, I think I am with you on a gut-feeling level. I am envisioning an analog of a massively, finely grained parallel system that might be visualized as an image being viewed by, say, a human. The image is robust to perturbations caused by any small number of pixels going bad; and what if each pixel’s state is a weighted average of a large number of replicated computations behind it, such that the fidelity of the pixel state is only minimally perturbed by the noise failure of any few of those machines? IOW, if you take the parallelism and replication far enough and fine enough, you get an inherent robustness. Close? Like, maybe there’s a Boyle’s Law of fault tolerance?
A very lightly baked idea. But my gut feels good about it, even if the words aren’t perfect.
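A rough numerical toy (my own, with made-up parameters) for the pixel intuition above: each “pixel” is the average of N replicated computations, a handful of which fail and return garbage; the finer the replication, the less those failures move the pixel.

```python
import random

def pixel_value(true_value, n_replicas, n_failed):
    """Average n_replicas copies of a computation, n_failed of which have
    been replaced by random garbage in [0, 1]."""
    good = [true_value] * (n_replicas - n_failed)
    bad = [random.random() for _ in range(n_failed)]
    return sum(good + bad) / n_replicas

true_value = 0.75
for n in (10, 100, 10_000):
    error = abs(pixel_value(true_value, n, n_failed=5) - true_value)
    print(n, round(error, 5))
# The same five failed machines barely perturb the pixel once the replication
# behind it is fine-grained enough -- the "inherent robustness" in the comment.
```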
@30 and antecedents: Noise. Annealing. 5 – 10% chaos is a Good Thing ™. More input, more input …
I just ran across a good statement of this problem in a paper from Pippenger: http://arxiv.org/abs/0709.0967
“The theory of fault-tolerant computation has a history almost as old as that of fault-tolerant communication. The most widely used theoretical model for computation, the Turing machine, is unsuitable for the study of fault-tolerant computation: it calls for leaving large amounts of data unattended on tapes for long periods of time; it seems unrealistic to assume that this data will not be corrupted by failures (at a positive constant rate per tape cell and per time step), but for a Turing machine (which can perform just one basic action per time step) there is no hope of keeping up with the failures that occur in the absence of such an assumption.”