Scientific Birthdays – Toffoli gate honored on namesake's 70th

Tommaso Toffoli’s 3-input 3-output logic gate, central to the theory of reversible and quantum computing, recently featured on a custom cake made for his 70th birthday.
Nowadays scientists’ birthday celebrations often take the form of informal mini-conferences, festschrifts without the schrift.  I have had the honor of attending several this year, including ones for John Preskill’s and Wojciech Zurek’s 60th birthdays and a joint 80th birthday conference for Myriam Sarachik and Daniel Greenberger, both physics professors at City College of New York.  At that party I learned that Greenberger and Sarachik have known each other since high school.  Neither has any immediate plans for retirement.
Greenberger Sarachik 80th birthday symposium
 

Resolution of Toom's rule paradox

A few days ago our Ghost Pontiff Dave Bacon wondered how Toom’s noisy but highly fault-tolerant 2-state classical cellular automaton can get away with violating the Gibbs phase rule, according to which a finite-dimensional locally interacting system, at generic points in its phase diagram, can have only one thermodynamically stable phase.  The Gibbs rule is well illustrated by the low-temperature ferromagnetic phases of the classical Ising model in two or more dimensions:  both phases are stable at zero magnetic field, but an arbitrarily small field breaks the degeneracy between their free energies, making one phase metastable with respect to nucleation and growth of islands of the other.  In the Toom model, by contrast, the two analogous phases are absolutely stable over a finite area of the phase diagram, despite biased noise that would seem to favor one phase over the other.  Of course Toom’s rule is not microscopically reversible, so it is not bound by the laws of equilibrium thermodynamics.

Nevertheless, as Dave points out, the distribution of histories of any locally interacting d-dimensional system, whether microscopically reversible or not, can be viewed  as an equilibrium Gibbs distribution of a d+1 dimensional system, whose local Hamiltonian is chosen so that the d dimensional system’s transition probabilities are given by Boltzmann exponentials of interaction energies between consecutive time slices.  So it might seem, looking at it from the d+1 dimensional viewpoint, that the Toom model ought to obey the Gibbs phase rule too.
The resolution of this paradox, described in my 1985 paper with Geoff Grinstein,  lies in the fact that the d to d+1 dimensional mapping is not surjective.  Rather it is subject to the normalization constraint that for every configuration X(t) at time t, the sum over configurations X(t+1) at time t+1 of transition probabilities P(X(t+1)|X(t)) is exactly 1.    This in turn forces the d+1 dimensional free energy to be identically zero, regardless of how the d dimensional system’s transition probabilities are varied.  The Toom model is able to evade the Gibbs phase rule because

  • being irreversible, its d dimensional free energy is ill-defined, and
  • the normalization constraint allows two phases to have exactly equal  d+1 dimensional free energy despite noise locally favoring one phase or the other.
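To see the second point concretely (a sketch, with the initial configuration X(0) held fixed and T time steps), sum the product of transition probabilities over all histories:
\[
Z_{d+1} \;=\; \sum_{X(1),\dots,X(T)} \;\prod_{t=0}^{T-1} P\bigl(X(t+1)\mid X(t)\bigr) \;=\; 1,
\qquad\text{so}\qquad
F_{d+1} \;=\; -\ln Z_{d+1} \;=\; 0 .
\]
Carrying out the sums starting with the one over X(T), each sum over X(t+1) equals 1 by normalization, so the d+1 dimensional free energy stays pinned at zero no matter how the noise is biased.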

Just outside the Toom model’s bistable region is a region of metastability (roughly within the dashed lines in the above phase diagram) which can be given an interesting interpretation in terms of the  d+1 dimensional free energy.  According to this interpretation, a metastable phase’s free energy is no longer zero, but rather -ln(1-Γ)≈Γ, where Γ is the nucleation rate for transitions leading out of the metastable phase.  This reflects the fact that the transition probabilities no longer sum to one, if one excludes transitions causing breakdown of the metastable phase.  Such transitions, whether the underlying d-dimensional model is reversible (e.g. Ising) or not (e.g. Toom), involve critical fluctuations forming an island of the favored phase just big enough to avoid being collapsed by surface tension.  Such critical fluctuations occur at a rate
Γ≈ exp(-const/s^(d-1))
where s>0 is the distance in parameter space from the bistable region (or in the Ising example, the bistable line).  This expression, from classical homogeneous nucleation theory, makes the d+1 dimensional free energy a smooth but non-analytic function of s, identically zero wherever a phase is stable, but lifting off very smoothly from zero as one enters the region of metastability.
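For completeness, the exponent comes from the usual critical-droplet estimate (a sketch in my notation, with σ the surface tension and the bias s playing the role of a bulk free-energy difference per unit volume): the cost of an island of radius R is
\[
\Delta F(R) \;\sim\; \sigma R^{\,d-1} - s\,R^{\,d},
\qquad R^{*} \sim \frac{\sigma}{s},
\qquad \Delta F^{*} \sim \frac{\sigma^{\,d}}{s^{\,d-1}},
\qquad \Gamma \;\sim\; e^{-\Delta F^{*}} \;\sim\; e^{-\mathrm{const}/s^{\,d-1}} ,
\]
the surface term favoring collapse and the bulk term favoring growth, with the barrier maximized at the critical radius R*.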
 

 
Below, we compare  the d and d+1 dimensional free energies of the Ising model with the d+1 dimensional free energy of the Toom model on sections through the bistable line or region of the phase diagram.

We have been speaking so far only of classical models.  In the world of quantum phase transitions another kind of d to d+1 dimensional mapping is much more familiar, the quantum Monte Carlo method, nicely described in these lecture notes, whereby a d dimensional zero-temperature quantum system is mapped to a d+1 dimensional finite-temperature classical Monte Carlo problem.   Here the extra dimension, representing imaginary time, is used to perform a path integral, and unlike the classical-to-classical mapping considered above, the mapping is bijective, so that features of the d+1 dimensional classical system can be directly identified with corresponding ones of the d dimensional quantum one.
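The canonical example (sketched here in my own conventions, which need not match the linked notes) is the transverse-field Ising chain, whose Suzuki–Trotter decomposition yields an anisotropic classical Ising model in one higher dimension:
\[
H \;=\; -J\sum_i \sigma^z_i \sigma^z_{i+1} \;-\; h\sum_i \sigma^x_i ,
\]
\[
Z \;=\; \mathrm{Tr}\, e^{-\beta H} \;\approx\; C \sum_{\{s_{i,\tau}=\pm1\}} \exp\!\Bigl[\sum_{i,\tau}\bigl(K_x\, s_{i,\tau} s_{i+1,\tau} + K_\tau\, s_{i,\tau} s_{i,\tau+1}\bigr)\Bigr],
\qquad K_x = \Delta\tau\, J, \quad e^{-2K_\tau} = \tanh(\Delta\tau\, h),
\]
with Δτ = β/M, the index τ = 1,…,M (periodic) labeling the imaginary-time slices that supply the extra dimension, and C an overall constant.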
 
 

A Paradox of Toom's Rule?

Science is slow.  You can do things like continue a conversation with yourself (and a few commenters) that started in 2005.  Which is what I’m now going to do 🙂  The below is probably a trivial observation for one of the cardinals, but I find it kind of interesting.
Let’s begin by recalling the setup.  Toom’s rule is a cellular automaton rule for a two-dimensional cellular automaton on a square grid.  Put +1’s and -1’s on the vertices of a square grid, and then use the following update rule at each step: “Update the value with the majority vote of your own state, the state of your neighbor to the north, and the state of your neighbor to the east.”  A few steps of the rule are shown here with +1 as white and -1 as blue:
As you can see, Toom’s rule “shrinks” islands of “different” states (taking away such different cells from the north and east sides of such an island).  It is this property which gives Toom’s rule some cool properties in the presence of noise.
So now consider Toom’s rule, but with noise.  Replace Toom’s update rule with the rule followed by a noise process applied to each and every cell.  For example, this noise could put the cell into state +1 with probability p and into state -1 with probability q.  Suppose now you are trying to store information in the cellular automaton.  You start out at time zero, say, in the all +1 state.  Then let Toom’s rule with noise run.  If p=q and these values are below a threshold, then if you start in the +1 state you will remain in a state with majority +1 with a probability that goes to one exponentially as a function of the system size.  Similarly if you start in -1.  The cool thing about Toom’s rule is that this works not just for p=q, but also for some values of p not equal to q (see here for a picture of the phase diagram).  That is, there are two stable states in this model, even for biased noise.
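Here is a minimal simulation sketch in Python (my own conventions, not from the original discussion: periodic boundaries, row 0 taken as the top of the grid, and p, q treated as probabilities):

import numpy as np

rng = np.random.default_rng(0)

def noisy_toom_step(grid, p, q):
    """One synchronous update of Toom's rule followed by biased noise.

    Each cell is replaced by the majority of itself, its neighbor to the
    north and its neighbor to the east; then each cell is independently
    reset to +1 with probability p or to -1 with probability q.
    """
    north = np.roll(grid, -1, axis=0)   # row above, with periodic wrap-around
    east = np.roll(grid, -1, axis=1)    # column to the right, periodic
    new = np.sign(grid + north + east)  # majority of three +/-1 values

    r = rng.random(grid.shape)
    new[r < p] = 1
    new[(r >= p) & (r < p + q)] = -1
    return new

# Start in the all +1 state and watch whether the majority survives the noise.
L = 64
grid = np.ones((L, L), dtype=int)
for _ in range(200):
    grid = noisy_toom_step(grid, p=0.02, q=0.05)
print("final magnetization:", grid.mean())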
Contrast Toom’s rule with a two-dimensional Ising model which is in the process of equilibrating to temperature T.  If this model has no external field applied, then like Toom’s rule there is a regime where the mostly +1 and the mostly -1 states are stable and coexist.  This holds from zero temperature (no dynamics) up to a threshold, the critical temperature of the Ising model.  But, unlike in Toom’s rule, if you now add an external field, which corresponds to a dynamics where there is a greater probability of flipping the cell values to a particular value (p not equal to q above), then the Ising model no longer has two stable phases.
In fact there is a general argument that if you look at a phase diagram as a function of a bunch of parameters (say temperature and applied magnetic field strength in this case), then the places where two stable phases can coexist have to form a surface with one less dimension than your parameter space.  This is known as Gibbs’ phase rule.  Toom’s rule violates this.  It’s an example of a nonequilibrium system.
So here is what is puzzling me.  Consider a three-dimensional cubic lattice with +1, -1 spins on its vertices.  Define an energy function that is a sum over terms acting on the spins at locations (i,j,k), (i+1,j,k), (i,j+1,k), (i,j,k+1), such that E = 0 if the spin at (i,j,k+1) is in the correct state for Toom’s rule applied to spins (i,j,k), (i+1,j,k), and (i,j+1,k), and E = J otherwise.  In other words the terms enforce that the ground state locally obeys Toom’s rule, if we imagine rolling out Toom’s rule into the time dimension (here the z direction).  At zero temperature, the ground state of this system will be two-fold degenerate, corresponding to the all +1 and all -1 states.  At finite temperature this model will behave as a symmetric-noise Toom’s rule model (see here for why).  So even at finite temperature this will preserve information, like the d>2 Ising model and Toom’s CA rule.
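Written out explicitly (my notation: δ the Kronecker delta and T(a,b,c) the majority of three ±1 spins), the energy function just described is
\[
E \;=\; J \sum_{i,j,k} \Bigl[\,1 - \delta\bigl(s_{i,j,k+1},\, T(s_{i,j,k},\,s_{i+1,j,k},\,s_{i,j+1,k})\bigr)\Bigr],
\qquad T(a,b,c) = \operatorname{sgn}(a+b+c),
\]
so each violated Toom constraint costs energy J.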
But since this behaves like Toom’s rule, it seems to me that if you add an external field, then this system is in a bit of a paradox.  On the one hand, we know from Gibbs’ phase rule that this should not be able to exhibit two stable phases over a range of external fields.  On the other hand, this thing is just Toom’s rule, laid out spatially.  So it would seem that one could apply the arguments about why Toom’s rule is robust at finite field.  But these contradict each other.  So which is it?
 

4 Pages

Walk up to a physicist at a party (we could add a conditional about the amount of beer consumed by the physicist at this point, but that would be redundant, it is a party after all), and say to him or her “4 pages.”  I’ll bet you that 99 percent of the time the physicist’s immediate response will be the three words “Physical Review Letters.”  PRL, a journal of the American Physical Society, is one of the top journals to publish in as a physicist, signaling to the mating masses whether you are OK and qualified to be hired as faculty at (insert your college name here).  I jest!  (As an aside, am I the only one who reads what APS stands for and wonders why I have to see the doctor to try out for high school tennis?)  In my past life, before I passed away as Pontiff, I was quite proud of the PRLs I’d been lucky enough to have helped with, including one that has some cool integrals, and another that welcomes my niece into the world.
Wait, what?!?  Yes, in “Coherence-Preserving Quantum Bits” the acknowledgements include a reference to my brother’s newborn daughter.  Certainly I know of no other paper where such an acknowledgement to a beloved family member is given.  The other interesting bit about that paper is that we (okay, probably you can mostly blame me) originally entitled it “Supercoherent Quantum Bits.”  PRL, however, has a policy about new words coined by authors, and, while we almost made it to the end without the referee or editor noticing, they made us change the title because “Supercoherent Quantum Bits” would be a new word.  Who would have thought that being a PRL editor meant you had to be a defender of the lexicon?  (Good thing Ben didn’t include qubits in his title.)
Which brings me to the subject of this post.  This is a cool paper.  It shows that a very nice quantum error correcting code due to Bravyi and Haah admits a transversal (all at once now, comrades!) controlled-controlled-phase gate, and that this, combined with another transversal gate (everyone’s fav the Hadamard) and fault-tolerant quantum error correction is universal for quantum computation.  This shows a way to not have to use state distillation for quantum error correction to perform fault-tolerant quantum computing, which is exciting for those of us who hope to push the quantum computing threshold through the roof with resources available to even a third world quantum computing company.
What does this have to do with PRL?  Well this paper has four pages.  I don’t know if it is going to be submitted or has already been accepted at PRL, but it has that marker that sets off my PRL radar, bing bing bing!  And now here is an interesting thing I found in this paper.  The awesome amazing very cool code in this paper  is defined via its stabilizer

IIIIIIIXXXXXXXX; IIIIIIIZZZZZZZZ,
IIIXXXXIIIIXXXX; IIIZZZZIIIIZZZZ,
IXXIIXXIIXXIIXX; IZZIIZZIIZZIIZZ,
XIXIXIXIXIXIXIX; ZIZIZIZIZIZIZIZ,
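(As a quick sanity check, one can verify in a few lines of Python that these X-type and Z-type generators commute pairwise, reading each row as 15 tensor factors from left to right; this is my own aside, not something from the paper.)

from itertools import combinations

# The four X-type generators from the listing above; the Z-type ones have
# the same supports with X replaced by Z.
x_stabs = [
    "IIIIIIIXXXXXXXX",
    "IIIXXXXIIIIXXXX",
    "IXXIIXXIIXXIIXX",
    "XIXIXIXIXIXIXIX",
]
z_stabs = [s.replace("X", "Z") for s in x_stabs]

def commute(a, b):
    """Two Pauli strings commute iff they anticommute on an even number of sites."""
    clashes = sum(1 for u, v in zip(a, b) if "I" not in (u, v) and u != v)
    return clashes % 2 == 0

assert all(commute(a, b) for a, b in combinations(x_stabs + z_stabs, 2))
print("all 8 generators commute pairwise")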

This takes up a whopping 4 lines of the article, whereas the disclaimer in the acknowledgements reads

The U.S. Government is authorized to
reproduce and distribute reprints for Governmental pur-
poses notwithstanding any copyright annotation thereon.
Disclaimer: The views and conclusions contained herein
are those of the authors and should not be interpreted
as necessarily representing the official policies or endorse-
ments, either expressed or implied, of IARPA, DoI/NBC,
or the U.S. Government.

Now I’m not some come-of-age tea party enthusiast who yells at the government like a coyote howls at the moon (I went to Berkeley damnit, as did my parents before me.)  But really, have we come to a point where the god-damn disclaimer on an important paper is longer than the actual definition of the code that makes the paper so amazing?
Before I became a ghost pontiff, I had to raise money from many different three, four, and five letter agencies.  I’ve got nothing but respect for the people who worked the jobs that help supply funding for large research areas like quantum computing.  In fact I personally think we probably need even more people to execute on the civic duty of getting funding to the most interesting and most trans-form-ative long and short term research projects. But really?  A disclaimer longer than the code which the paper is about?  Disclaiming, what exactly?  Erghhh.

Non-chaotic irregularity

In principle, barring the intervention of chance, identical causes lead to identical effects.  And except in chaotic systems, similar causes lead to similar effects.  Borges’ story “Pierre Menard” exemplifies an extreme version of this idea: an early 20th-century writer studies Cervantes’ life and times so thoroughly that he is able to recreate several chapters of “Don Quixote” without mistakes and without consulting the original.
Meanwhile, back at the ShopRite parking lot in Croton-on-Hudson, NY, they’d installed half a dozen identical red and white parking signs, presumably all from the same print run, and all posted in similar environments, except for two in a sunnier location.

The irregular patterns of cracks that formed as the signs weathered were so similar that at first I thought the cracks had also been printed, but then I noticed small differences. The sharp corners on letters like S and E,  apparently points of high stress, usually triggered near-identical cracks in each sign, but not always, and in the sunnier signs many additional fine cracks formed. 
Another example of reproducibly irregular dynamics was provided over 30 years ago by Ahlers and Walden’s experiments on convective turbulence, where a container of normal liquid helium, heated from below, exhibited nearly the same sequence of temperature fluctuations in several runs of the experiment.

 
 
 

Test your intuition

The name of this post was shamelessly stolen from Gil Kalai’s popular series Test Your Intuition. But today’s post will be testing our physics intuition, rather than our mathematical intuition. Although this is a quantum blog, we’ll look at the behavior of a classical fluid.
The question is: what happens when you soak a washcloth with water and then wring it out… in zero gravity?
Think about it for a few minutes before watching the result of the actual experiment below.

Throwing cold water on the Quantum Internet

There has been a lot of loose talk lately about a coming “Quantum Internet”. I was asked about it recently by a journalist and gave him this curmudgeonly answer, hoping to redirect some of the naive enthusiasm:
…First let me remark that “quantum internet” has become a bit of a buzzword that can lead to an inaccurate idea of the likely role of quantum information in a future global information infrastructure.  Although quantum concepts such as qubit and entanglement have revolutionized our understanding of the nature of information, I believe that the Internet will remain almost entirely classical, with quantum communication and computation being used for a few special purposes, where the unique capabilities of quantum information are needed.  For other tasks, where coherent control of the quantum state is not needed, classical processing will suffice and will remain cheaper, faster, and more reliable for the foreseeable future.  Of course there is a chance that quantum computers and communications links may some day become so easy to build that they are widely used for general purpose computing and communication, but I think it highly unlikely.
Would the quantum internet replace the classical one, or would the two somehow coexist?
As remarked above, I think the two would coexist, with the Internet remaining mostly classical. Quantum communication will be used for special purposes, like sharing cryptographic keys, and quantum computing will be used in those few situations where it gives a significant speed advantage (factoring large numbers, some kinds of search, and the simulation of quantum systems), or for the processing of inherently quantum signals (say from a physics experiment).
Would quantum search engines require qubits transmitted between the user’s computer and the web searcher’s host? Or would they simply use a quantum computer performing the search on the host machine, which could then return its findings classically?
It’s not clear that quantum techniques would help search engines, either in transmitting the data to the search engine, or in performing the search itself. Grover’s algorithm (where coherent quantum searching gives a quadratic speedup over classical searching) is less applicable to the large databases on which search engines operate, than to problems like the traveling salesman problem, where the search takes place not over a physical database, but over an exponentially large space of virtual possibilities determined by a small amount of physical data.
On the other hand, quantum techniques could play an important supporting role, not only for search engines but other Internet applications, by helping authenticate and encrypt classical communications, thereby making the Internet more secure. And as I said earlier, dedicated quantum computers could be used for certain classically-hard problems like factoring, searches over virtual spaces, simulating quantum systems, and processing quantum data.
When we talk about quantum channels do we mean a quantum communication link down which qubits can be sent and which prevents them decohering, or are these channels always an entangled link? …
A quantum channel of the sort you describe is needed, both to transmit quantum signals  and to share entanglement. After entanglement has been shared, if one has a quantum memory, it can be stored and used later in combination with a classical channel to transmit qubits.  This technique is called quantum teleportation (despite this name, for which I am to blame, quantum teleportation cannot be used for transporting material objects).
But could we ever hope for quantum communication in which no wires are needed – but entanglement handles everything?
The most common misconception about entanglement is that it can be used to communicate—transmit information from a sender to a receiver—perhaps even instantaneously. In fact it cannot communicate at all, except when assisted by a classical or quantum channel, neither of which communicate faster than the speed of light. So a future Internet will need wires, radio links, optical fibers, or other kinds of communications links, mostly classical, but also including a few quantum channels.
How soon before the quantum internet could arrive?
I don’t think there will ever be an all-quantum or mostly-quantum internet. Quantum cryptographic systems are already in use in a few places, and I think can fairly be said to have proven potential for improving cybersecurity. Within a few decades I think there will be practical large-scale quantum computers, which will be used to solve some problems intractable on any present or foreseeable classical computer, but they will not replace classical computers for most problems. I think the Internet as a whole will continue to consist mostly of classical computers, communications links, and data storage devices.
Given that the existing classical Internet is not going away, what sort of global quantum infrastructure can we expect, and what would it be used for?  Quantum cryptographic key distribution, the most mature quantum information application, is already deployed over short distances today (typically < 100 km).   Planned experiments between ground stations and satellites in low earth orbit promise to increase this range several fold.  The next and more important stage, which depends on further progress in quantum memory and error correction,  will probably be the development of a network of quantum repeaters, allowing entanglement to be generated between any two nodes in the network, and, more importantly, stockpiled and stored until needed.  Aside from its benefits for cybersecurity (allowing quantum-generated cryptographic keys to be shared  between any two nodes without having to trust the intermediate nodes) such a globe-spanning quantum repeater network will have important scientific applications, for example allowing coherent quantum measurements to be made on astronomical signals over intercontinental distances.  Still later, one can expect full scale quantum computers to be developed and attached to the repeater network.  We would then finally have achieved a capacity for fully general processing of quantum information, both locally and globally—an expensive, low-bandwidth quantum internet if you will—to be used in conjunction with the cheap high-bandwidth classical Internet when the unique capabilities of quantum information processing are needed.

Quantum Frontiers

As a postdoc at Caltech, I would often have lunch with John Preskill.  About once per week, we would play a game. During the short walk back, I would think of a question to which I didn’t know the answer. Then with maybe 100 meters to go, I would ask John that question. He would have to answer the question via a 20 minute impromptu lecture given right away, as soon as we walked into the building.
Now, these were not easy questions. At least, not to your average person, or even your average physicist. For example, “John, why do neutrinos have a small but nonzero mass?” Perhaps any high-energy theorist worth their salt would know the answer to that question, but it simply isn’t part of the training for most physicists, especially those in quantum information science.
Every single time, John would give a clear, concise and logically well-organized answer to the question at hand. He never skimped on equations when they were called for, but he would often analyze these problems using simple symmetry arguments and dimensional analysis—undergraduate physics!  At the end of each lecture, you really felt like you understood the answer to the question that was asked, which only moments ago seemed like it might be impossible to answer.
But the point of this post is not to praise John.  Instead, I’m writing it so that I can set high expectations for John’s new blog, called Quantum Frontiers.  Yes, that’s right, John Preskill has a blog now, and I hope that he’ll exceed these high expectations with content of similar or higher quality to what I witnessed in those after-lunch lectures. (John, if you’re reading this, no pressure.)
And John won’t be the only one blogging. It seems that the entire Caltech IQIM will “bring you firsthand accounts of the groundbreaking research taking place inside the labs of IQIM, and to answer your questions about our past, present and future work on some of the most fascinating questions at the frontiers of quantum science.”
This sounds pretty exciting, and it’s definitely a welcome addition to the (underrepresented?) quantum blogosphere.

What increases when a self-organizing system organizes itself? Logical depth to the rescue.

(An earlier version of this post appeared in the latest newsletter of the American Physical Society’s special interest group on Quantum Information.)
One of the most grandly pessimistic ideas from the 19th century is that of  “heat death” according to which a closed system, or one coupled to a single heat bath at thermal  equilibrium,  eventually inevitably settles into an uninteresting state devoid of life or macroscopic motion.  Conversely, in an idea dating back to Darwin and Spencer, nonequilibrium boundary conditions are thought to have caused or allowed the biosphere to self-organize over geological time.  Such godless creation, the bright flip side of the godless hell of heat death, nowadays seems to worry creationists more than Darwin’s initially more inflammatory idea that people are descended from apes. They have fought back, using superficially scientific arguments, in their masterful peanut butter video.
Self-organization versus heat death
Much simpler kinds of complexity generation occur in toy models with well-defined dynamics, such as this one-dimensional reversible cellular automaton.  Started from a simple initial condition at the left edge (periodic, but with a symmetry-breaking defect) it generates a deterministic wake-like history of growing size and complexity.  (The automaton obeys a second order transition rule, with a site’s future differing from its past iff exactly two of its first and second neighbors in the current time slice, not counting the site itself, are black and the other two are white.)
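Here is a minimal sketch of that rule in Python (the seeding of the first two time slices is my own guess at a “simple initial condition with a symmetry-breaking defect,” so the figure’s exact pattern may differ):

import numpy as np

def step(past, present):
    """Second-order reversible rule: a site's next value differs from its value
    two slices ago iff exactly two of its four nearest and next-nearest
    neighbors in the current slice (the site itself excluded) are black (1)."""
    neighbors = sum(np.roll(present, k) for k in (-2, -1, 1, 2))
    flip = (neighbors == 2).astype(int)
    return past ^ flip

width, steps = 240, 120
past = np.zeros(width, dtype=int)   # all white two slices ago
present = past.copy()
present[2] = 1                      # symmetry-breaking defect near the left edge
history = [present]
for _ in range(steps):
    past, present = present, step(past, present)
    history.append(present)
# `history` is the space-time diagram: rows are time slices, 1 = black.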

Fig. 2 (time →)
But just what is it that increases when a self-organizing system organizes itself?
Such organized complexity is not a thermodynamic potential like entropy or free energy.  To see this, consider transitions between a flask of sterile nutrient solution and the bacterial culture it would become if inoculated by a single seed bacterium.  Without the seed bacterium, the transition from sterile nutrient to bacterial culture is allowed by the Second Law, but prohibited by a putative “slow growth law”, which prohibits organized complexity from increasing quickly, except with low probability.
Fig. 3

The same example shows that organized complexity is not an extensive quantity like free energy.  The free energy of a flask of sterile nutrient would be little altered by adding a single seed bacterium, but its organized complexity must have been greatly increased by this small change.  The subsequent growth of many bacteria is accompanied by a macroscopic drop in free energy, but little change in organized complexity.
The relation between universal computer programs and their outputs has long been viewed as a formal analog of the relation between theory and phenomenology in science, with the various programs generating a particular output x being analogous to alternative explanations of the phenomenon x.  This analogy draws its authority from the ability of universal computers to execute all formal deductive processes and their presumed ability to simulate all processes of physical causation.
In algorithmic information theory the Kolmogorov complexity of a bit string x is defined as the size in bits of its minimal program x*, the smallest (and lexicographically first, in case of ties) program causing a standard universal computer U to produce exactly x as output and then halt.
 x* = min{p: U(p)=x}
Because of the ability of universal machines to simulate one another, a string’s Kolmogorov complexity is machine-independent up to a machine-dependent additive constant, and similarly is equal to within an additive constant to the string’s algorithmic entropy HU(x), the negative log of the probability that U would output exactly x and halt if its program were supplied by coin tossing.  Bit strings whose minimal programs are no smaller than the string itself are called incompressible, or algorithmically random, because they lack internal structure or correlations that would allow them to be specified more concisely than by a verbatim listing.  Minimal programs themselves are incompressible to within O(1), since otherwise their minimality would be undercut by a still shorter program.  By contrast to minimal programs, any program p that is significantly compressible is intrinsically implausible as an explanation for its output, because it contains internal redundancy that could be removed by deriving it from the more economical hypothesis p*.  In terms of Occam’s razor, a program that is compressible by s bits is deprecated as an explanation of its output because it suffers from s bits worth of ad-hoc assumptions.
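In symbols, with the standard self-delimiting-program conventions (one common way to set this up), the quantities just described are
\[
K_U(x) \;=\; |x^{*}|, \qquad
H_U(x) \;=\; -\log_2 \sum_{p\,:\,U(p)=x} 2^{-|p|}, \qquad
K_U(x) \;=\; H_U(x) + O(1).
\]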
Though closely related[1] to statistical entropy, Kolmogorov complexity itself is not a good measure of organized complexity because it assigns high complexity to typical random strings generated by coin tossing, which intuitively are trivial and unorganized.  Accordingly many authors have considered modified versions of Kolmogorov complexity—also measured in entropic units like bits—hoping thereby to quantify the nontrivial part of a string’s information content, as opposed to its mere randomness.  A recent example is Scott Aaronson’s notion of complextropy, defined roughly as the number of bits in the smallest program for a universal computer to efficiently generate a probability distribution relative to which x  cannot efficiently be recognized as atypical.
However, I believe that entropic measures of complexity are generally unsatisfactory for formalizing the kind of complexity found in intuitively complex objects found in nature or gradually produced from simple initial conditions by simple dynamical processes, and that a better approach is to characterize an object’s complexity by the amount of number-crunching (i.e. computation time, measured in machine cycles, or more generally other dynamic computational resources such as time, memory, and parallelism) required to produce the object from a near-minimal-sized description.
This approach, which I have called  logical depth, is motivated by a common feature of intuitively organized objects found in nature: the internal evidence they contain of a nontrivial causal history.  If one accepts that an object’s minimal program represents its most plausible explanation, then the minimal program’s run time represents the number of steps in its most plausible history.  To make depth stable with respect to small variations in x or U, it is necessary also to consider programs other than the minimal one, appropriately weighted according to their compressibility, resulting in the following two-parameter definition.

  • An object x is called d-deep with s bits significance iff every program for U to compute x in time <d is compressible by at least s bits. This formalizes the idea that every hypothesis for x to have originated more quickly than in time d contains s bits worth of ad-hoc assumptions.
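One compact restatement (my notation, with T(p) the running time of program p and K(p) the size of p’s own minimal program) is
\[
D_s(x) \;=\; \min\bigl\{\, T(p) \;:\; U(p) = x,\ \ |p| - K(p) < s \,\bigr\},
\]
so that every program producing x in time less than D_s(x) is compressible by at least s bits.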

Dynamic and static resources, in the form of the parameters d and s, play complementary roles in this definition:  d as the quantifier and s as the certifier of the object’s nontriviality.  Invoking the two parameters in this way not only stabilizes depth with respect to small variations of x and U, but also makes it possible to prove that depth obeys a slow growth law, without which any mathematical definition of organized complexity would seem problematic.

  • A fast deterministic process cannot convert shallow objects to deep ones, and a fast stochastic process can only do so with low probability.  (For details see Bennett88.)

 
Logical depth addresses many infelicities and problems associated with entropic measures of complexity.

  • It does not impose an arbitrary rate of exchange between the independent variables of strength of evidence and degree of nontriviality of what the evidence points to, nor an arbitrary maximum complexity that an object can have, relative to its size.  Just as a microscopic fossil can validate an arbitrarily long evolutionary process, so a small fragment of a large system, one that has evolved over a long time to a deep state, can contain evidence of the entire depth of the large system, which may be more than exponential in the size of the fragment.
  • It helps explain the increase of complexity at early times and its decrease at late times by providing different mechanisms for these processes.  In figure 2, for example, depth increases steadily at first because it reflects the duration of the system’s actual history so far.  At late times, when the cellular automaton has run for a generic time comparable to its Poincare recurrence time, the state becomes shallow again, not because the actual history was uneventful, but because evidence of that history has become degraded to the point of statistical insignificance, allowing the final state to be generated quickly from a near-incompressible program that short-circuits the system’s actual history.
  • It helps explain why some systems, despite being far from thermal equilibrium, never self-organize.  For example in figure 1, the gaseous sun, unlike the solid earth, appears to lack means of remembering many details about its distant past.  Thus while it contains evidence of its age (e.g. in its hydrogen/helium ratio), almost all evidence of particular details of its past, e.g. the locations of sunspots, is probably obliterated fairly quickly by the sun’s hot, turbulent dynamics.  On the other hand, systems with less disruptive dynamics, like our earth, could continue increasing in depth for as long as their nonequilibrium boundary conditions persisted, up to an exponential maximum imposed by Poincare recurrence.
  • Finally, depth is robust with respect to transformations that greatly alter an object’s size and Kolmogorov complexity, and many other entropic quantities, provided the transformation leaves intact significant evidence of a nontrivial history. Even a small sample of the biosphere, such as a single DNA molecule, contains such evidence.  Mathematically speaking, the depth of a string x is not much altered by replicating it (like the bacteria in the flask), padding it with zeros or random digits, or passing it through a noisy channel (although the latter treatment decreases the significance parameter s).  If the whole history of the earth were derandomized, by substituting deterministic pseudorandom choices for all its stochastic accidents, the complex objects in this substitute world would have very little Kolmogorov complexity, yet their depth would be about the same as if they had resulted from a stochastic evolution.

The remaining infelicities of logical depth as a complexity measure are those afflicting computational complexity and algorithmic entropy theories generally.

  • Lack of tight lower bounds: because of the open P versus PSPACE question, one cannot exhibit a system that provably generates depth more than polynomial in the space used.
  • Semicomputability:  deep objects can be proved deep (with exponential effort) but shallow ones can’t be proved shallow.  The semicomputability of depth, like that of Kolmogorov complexity, is an unavoidable consequence of the unsolvability of the halting problem.

The following observations partially mitigate these infelicities.

  • Using the theory of cryptographically strong pseudorandom functions one can argue (if such functions exist) that deep objects can be produced efficiently, in time polynomial and space polylogarithmic in their depth, and indeed that they are produced efficiently by some physical processes.
  • Semicomputability does not render a complexity measure entirely useless. Even though a particular string cannot be proved shallow, and requires an exponential amount of effort to prove it deep, the depth-producing properties of stochastic processes can be established, assuming the existence of cryptographically strong pseudorandom functions. This parallels the fact that while no particular string can be proved to be algorithmically random (incompressible), it can be proved that the statistically random process of coin tossing produces algorithmically random strings with high probability.

 
Granting that a logically deep object is one plausibly requiring a lot of computation to produce, one can consider various related notions:

  • An object  y  is deep relative to  x  if all near-minimal sized programs for computing  y  from  x  are slow-running.  Two shallow objects may be deep relative to one another, for example a random string and its XOR with a deep string.
  • An object can be called cryptic if it is computationally difficult to obtain a near-minimal sized program for the object from the object itself, in other words if any near-minimal sized program for x is deep relative to x.  One-way functions, if they exist, can be used to define cryptic objects; for example, in a computationally secure but information theoretically insecure cryptosystem, plaintexts should be cryptic relative to their ciphertexts.
  • An object can be called ambitious if, when presented to a universal computer as input, it causes the computer to embark on a long but terminating computation. Such objects, though they determine a long computation, do not contain evidence of it actually having been done.  Indeed they may be shallow and even algorithmically random.
  • An object can be called wise if it is deep and a large and interesting family of other deep objects are shallow relative to it. Efficient oracles for hard problems, such as the characteristic function of an NP-complete set, or the characteristic set K of the halting problem, are examples of wise objects.  Interestingly, Chaitin’s omega is an exponentially more compact oracle for the halting problem than K is, but it is so inefficient to use that it is shallow and indeed algorithmically random.

Further details about these notions can be found in Bennett88.  K.W. Regan in Dick Lipton’s blog discusses the logical depth of Bill Gasarch’s recently discovered solutions to the 17×17 and 18×18 four-coloring problems.
I close with some comments on the relation between organized complexity and thermal disequilibrium, which since the 19th century has been viewed as an important, perhaps essential, prerequisite for self-organization.   Broadly speaking, locally interacting systems at thermal equilibrium obey the Gibbs phase rule, and its generalization in which the set of independent parameters is enlarged to include not only intensive variables like temperature, pressure and magnetic field, but also all parameters of the system’s Hamiltonian, such as local coupling constants.   A consequence of the Gibbs phase rule is that for generic values of the independent parameters, i.e. at a generic point in the system’s phase diagram, only one phase is thermodynamically stable.  This means that if a system’s independent parameters are set to generic values, and the system is allowed to come to equilibrium, its structure will be that of this unique stable Gibbs phase, with spatially uniform properties and typically short-range correlations.   Thus for generic parameter values, when a system is allowed to relax to thermal equilibrium, it entirely forgets its initial condition and history and exists in a state whose structure can be adequately approximated by stochastically sampling the distribution of microstates characteristic of that stable Gibbs phase.  Dissipative systems—those whose dynamics is not microscopically reversible or whose boundary conditions prevent them from ever attaining thermal equilibrium—are exempt from the Gibbs phase rule for reasons discussed in BG85, and so are capable, other conditions being favorable, of producing structures of unbounded depth and complexity in the long time limit. For further discussion and a comparison of logical depth with other proposed measures of organized complexity, see B90.
 


[1] An elementary result of algorithmic information theory is that for any probability ensemble of bit strings (representing e.g. physical microstates), the ensemble’s Shannon entropy differs from the expectation of its members’ algorithmic entropy by at most the number of bits required to describe a good approximation to the ensemble.

 
 

Why the laity hope Einstein was wrong.

Although reputable news sources pointed out that most scientists think some more mundane explanation will be found for the too-early arrival of CERN-generated neutrinos in Gran Sasso, recently confirmed by a second round of experiments with much briefer pulse durations to exclude the most likely sources of systematic error, the take-home message for most non-scientists seems to have been “Einstein was wrong.  Things can go faster than light.”  Scientists trying to explain their skepticism often end up sounding closed-minded and arrogant.  People say, “Why don’t you take evidence of faster-than-light travel at face value, rather than saying it must be wrong because it disagrees with Einstein?”  The macho desire not to be bound by an arbitrary speed limit doubtless also helps explain why warp drives are such a staple of science fiction.  At a recent dinner party, as my wife silently reminded me that a lecture on time dilation and Fitzgerald contraction would be inappropriate, the best I could come up with was an analogy to another branch of physics where lay people’s intuition accords better with that of specialists:  I told them, without giving them any reason to believe me, that Einstein showed that faster-than-light travel would be about as far-reaching and disruptive in its consequences as an engine that required no fuel.
That was too crude an analogy.  Certainly a fuelless engine, if it could be built, would be more disruptive in its practical consequences, whereas faster-than-light neutrinos could be accommodated, without creating any paradoxes of time travel, if there were a preferred reference frame within which neutrinos traveling through rock could go faster than light, while other particles, including neutrinos traveling through empty space, would behave in the usual Lorentz-invariant fashion supported by innumerable experiments and astronomical observations.
But it is wrong to blame mere populist distrust of authority for this disconnect between lay and expert opinion. Rather the fault lies with a failure of science education, leaving the public with a good intuition for Galilean relativity, but little understanding of how it has been superseded by special relativity.  So maybe, after dinner is over and my audience is no longer captive, I should retell the old story of cosmic ray-generated muons, who see the onrushing earth as having an atmosphere only a few feet thick, while terrestrial observers see the muons’ lifetime as having been extended manyfold by time dilation.
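For the after-dinner version, the arithmetic is short; here it is as a Python back-of-envelope sketch (the 1 TeV muon energy and 15 km atmosphere are illustrative choices of mine, not numbers from the post):

import math

# Illustrative numbers: a 1 TeV cosmic-ray muon, rest mass ~105.7 MeV/c^2,
# proper lifetime ~2.2 microseconds, atmosphere taken as ~15 km thick.
E_MeV, m_MeV = 1.0e6, 105.7
tau_s, c = 2.2e-6, 3.0e8
atmosphere_m = 15_000.0

gamma = E_MeV / m_MeV                      # Lorentz factor, roughly 9500
beta = math.sqrt(1.0 - 1.0 / gamma**2)     # essentially 1

naive_range_m = beta * c * tau_s           # ~660 m: would never reach the ground
dilated_range_m = gamma * naive_range_m    # thousands of km: easily reaches it

# The muon's own view: the onrushing atmosphere is Lorentz-contracted.
contracted_m = atmosphere_m / gamma        # ~1.6 m, i.e. "a few feet thick"

print(f"gamma = {gamma:.0f}")
print(f"range without time dilation ~ {naive_range_m:.0f} m")
print(f"range with time dilation    ~ {dilated_range_m/1000:.0f} km")
print(f"atmosphere in muon frame    ~ {contracted_m:.1f} m")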
It is this difference in appreciation of special relativity that accounts for the fact that for most  people, faster-than-light travel seems far more plausible than time travel, whereas for experts, time travel, via closed timelike curves of general relativistic origin, is more plausible than faster-than-light travel in flat spacetime.