A jack of two trades and a master of both

IBM, in recounting its history of arranging high-profile contests between humans and computers, describes the tension surrounding the final chess game between Deep Blue and Garry Kasparov.  One of the machine’s designers “vividly remembers the final game of the six-game match in 1997. Deep Blue was so dominant that the game ended in less than two hours. His wife, Gina, had planned on arriving at the venue in the Equitable Center in New York City to watch the second half, but it was over before she arrived.”  To my mind, the machine’s raw chess prowess is less remarkable than its ability to multitask between chess and marriage.

Ghost Paper Dance!

In a belated revival of the Ghost Pontiff’s “Happy Paper Dance” ritual, I’d like to talk about the recent paper The k-local Pauli Commuting Hamiltonians Problem is in P by my student Jijiang (Johnny) Yan and his former advisor, Dave Bacon. The abstract is:

Given a Hamiltonian that is a sum of commuting few-body terms, the commuting Hamiltonian problem is to determine if there exists a quantum state that is the simultaneous eigenstate of all of these terms that minimizes each term individually. This problem is known to be in the complexity class quantum Merlin-Arthur, but is widely thought to not be complete for this class. Here we show that a limited form of this problem when the individual terms are all made up of tensor products of Pauli matrices is efficiently solvable on a classical computer and thus in the complexity class P. The problem can be thought of as the classical XOR-SAT problem over a symplectic vector space. This class of problems includes instance Hamiltonians whose ground states possess topological entanglement, thus showing that such entanglement is not always a barrier for the more general problem.

This result follows a long string of papers that discuss the complexity of finding the ground state energy of k-local Hamiltonians, usually modified by various adjectives like “commuting” or “frustration-free” or “Pauli” or “in d dimensions.” Typically, these problems are shown to be somewhere between NP and QMA, and the subtle differences between these relate to issues such as topological order and the quantum PCP conjecture. In fact, one specific inspiration for this paper was 1102.0770, which showed that 3-qubit (or even 3-qutrit) commuting Hamiltonians could not have topologically-ordered ground states, while 4-qubit commuting Hamiltonians include the toric code, and 2-qubit non-commuting Hamiltonians include things that look like the toric code.
This paper shows that, in the case of commuting Pauli Hamiltonians, the difference between 3-local and 4-local is not important from a complexity point of view; indeed, it is possible to efficiently find the ground state of even O(n)-local Hamiltonians.
At first this is shocking, but to see why it’s reasonable to expect this result, consider classical (commuting, Pauli) Hamiltonians. Determining whether these terms have a simultaneous ground state is equivalent to solving a linear system of equations over $\mathbb{F}_2$, which of course can be done in poly-time. This paper extends that to general Paulis, but the algorithm still involves solving linear systems of equations, this time over $\mathbb{F}_4$. It is one of my favorite examples of the power and simplicity of the Pauli matrices, tied perhaps with the elegant Wehner-Winter uncertainty relations for anti-commuting observables.
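To make the classical case concrete, here is a minimal sketch of the linear-algebra step, assuming each term is a signed product of Z operators, so that term j demands a fixed parity of the bits it touches. The function name `solve_gf2` and the encoding of terms as 0/1 rows are illustrative choices, not taken from the paper, and this covers only the $\mathbb{F}_2$ case, not the paper's algorithm for general Paulis.

```python
def solve_gf2(rows, rhs):
    """Solve A x = b over GF(2) by Gaussian elimination.

    rows: list of equal-length lists of 0/1 (which bits each term touches)
    rhs:  list of 0/1 (the parity each term demands)
    Returns one solution as a list of 0/1, or None if the system is inconsistent.
    """
    n = len(rows[0]) if rows else 0
    # Work on augmented copies so the caller's data is untouched.
    aug = [list(r) + [b] for r, b in zip(rows, rhs)]
    pivot_cols = []
    r = 0
    for c in range(n):
        # Find a row at or below r with a 1 in column c.
        p = next((i for i in range(r, len(aug)) if aug[i][c]), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        # Clear column c from every other row (XOR is addition mod 2).
        for i in range(len(aug)):
            if i != r and aug[i][c]:
                aug[i] = [u ^ v for u, v in zip(aug[i], aug[r])]
        pivot_cols.append(c)
        r += 1
    # A leftover row reading 0 = 1 means the terms cannot all be satisfied.
    if any(row[-1] for row in aug[r:]):
        return None
    x = [0] * n  # free variables are set to 0
    for i, c in enumerate(pivot_cols):
        x[c] = aug[i][-1]
    return x


# Three two-bit parity constraints on three bits (a hypothetical instance).
terms = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(solve_gf2(terms, [0, 1, 1]))  # a satisfying assignment
print(solve_gf2(terms, [0, 0, 1]))  # None: the constraints are frustrated
```

The elimination takes a number of bit operations polynomial in the number of terms and bits, which is the sense in which the classical problem is easy.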

(|Democrat> + |Republican>) / sqrt(2)?

It has long been known that party politics exhibits quantum effects. (An excerpt, that I’m sure is not retaliation for Sokal’s hoax, is “…we show evidence using the Smith et. al data that a tenet of a classical model that has animated work in the field appears violated in a form that gives way naturally to embrace of the superposition principle and then suggest that the classical formalisms and theories of preference separability might best be viewed as special cases of the quantum versions.”)
But what use is this theory, if we can’t apply it to any situations of practical relevance? Finally, the theory of quantum politics has found a way to explain current politics with A Quantum Theory of Mitt Romney, in an article that actually does a better job of explaining complementarity, uncertainty, etc. than do a lot of popular articles (quantum leap, not so much).
As an example of what this new theory explains, here is a Feynman diagram depicting a collision between a Romney and an anti-Romney that yields an electron and a $20 bill.

More on the NP-hardness of inferring dynamics

The previous post on David Voss’ APS piece quibbled perhaps excessively about the definition of NP, but neglected to mention  the actual subject of the piece, which was Cubitt, Eisert and Wolf’s (CEW) recent paper on the NP-hardness of extracting dynamical equations from experimental data.  This paper raises, and partly answers,  some subtle questions about the relation between computation and physics.  For example, one might ask, if the problem of inferring dynamics from experiment is intractable in general, doesn’t this doom the whole program of theoretical physics?  We will argue below that the results of CEW do not support such a  pessimistic view.  What CEW do show about the problem of inferring dynamics from observation (more details here) is that a seemingly easier problem—that of determining whether a given completely-positive trace-preserving map (representing a quantum system’s evolution over a discrete time interval) is consistent with some underlying Markov process operating in continuous time—is NP-complete.   Continuous time Markov processes, described by time-independent Lindblad operators, model the evolution of an open quantum system in contact with a memoryless environment.  Of course this is an approximation, since no environment can be entirely memoryless over very short time intervals, but it works quite well in many practical situations where a quantum system with only a few degrees of freedom interacts with a large, rapidly-relaxing environment such as a heat bath.  For such small systems  NP-completeness does not render a problem intractable, and the result of CEW can be used to infer a very nearly correct dynamical description—a Lindblad equation—from experimental data consisting of discrete time snapshots of the evolving open quantum system.
Suppose on the other hand we apply the CEW algorithm to some experimental data and find that it doesn’t fit any Markovian dynamics.  Then either the data is wrong or memory effects in the environment are too important to  be neglected.  To understand the dynamics of such systems we must treat the environment more respectfully, somehow modeling its most significant memory effects, or, if all else fails, “going to the church of the larger Hilbert space” and treating the system plus environment as a larger closed system, evolving unitarily.  This raises the question of whether an intractability result analogous to CEW’s finding for open systems also applies to unitarily evolving closed quantum systems.  We suspect that it does not, and that the problem of fitting a Hamiltonian to a series of snapshots of a unitarily evolving quantum system may be tractable, at least if the Hamiltonian is of the approximately local form commonly encountered in physics, and if the experimenter is free to choose the times of the snapshots.   The inferability of dynamics from experimental data, which as we indicated above underlies the whole program of theoretical physics, is, we believe, related to the quantum Church-Turing thesis,  that physical processes can be efficiently simulated by a quantum computer.
Be that as it may, what CEW show, in a positive sense, is how (albeit very laboriously for large systems) to infer Markovian dynamics when the Markovian approximation is justified, and in a negative sense, when one should abandon the Markovian approximation and infer the dynamics by other means.
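The question of whether a discrete-time snapshot is consistent with continuous-time Markovian dynamics has a classical cousin that is easy to play with: given a stochastic matrix P, is there a rate matrix Q with exp(Q) = P? The sketch below treats only the two-state classical case, where a standard result says such a Q exists exactly when the determinant of P is positive. It is offered as an illustration of the question, not as CEW's algorithm; the function names and the parametrization by the two jump probabilities `a` and `b` are assumptions made here.

```python
import math


def markov_generator_2x2(a, b):
    """Return a rate matrix Q with exp(Q) = [[1-a, a], [b, 1-b]], or None.

    a, b are the two jump probabilities of a 2-state stochastic matrix.
    None means no time-homogeneous Markov generator reproduces the snapshot.
    """
    if not (0 <= a <= 1 and 0 <= b <= 1):
        raise ValueError("a and b must be probabilities")
    s = a + b
    if s == 0:
        return [[0.0, 0.0], [0.0, 0.0]]  # identity map: nothing happens
    if s >= 1:
        return None  # determinant 1 - a - b is not positive
    c = -math.log(1 - s) / s
    # Rows sum to zero and off-diagonal rates are nonnegative.
    return [[-c * a, c * a], [c * b, -c * b]]


def expm_2x2(q, terms=60):
    """Matrix exponential by truncated Taylor series (adequate for small 2x2 inputs)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[sum(term[i][m] * q[m][j] for m in range(2)) / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result


q = markov_generator_2x2(0.2, 0.3)
print(q, expm_2x2(q))                  # exp(Q) should reproduce the snapshot
print(markov_generator_2x2(0.7, 0.6))  # None: not consistent with Markovian dynamics
```

A `None` result corresponds to the second branch discussed above: either the data is wrong or the memoryless approximation has to be abandoned.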

Hardness of NP

In computer science, NP-hard problems are widely believed to be intractable, not because they have been proved so, but on the empirical evidence of no one having found a fast algorithm for any of them in over half a century of trying.  But the concepts of  NP-hardness and NP-completeness are themselves hard for newcomers to understand.   The current American Physical Society piece Unbearable Hardness of Physics makes a common mistake when it takes NP-hard problems to mean problems Not solvable in time Polynomial in the size of their input, rather than those to which all problems solvable in Nondeterministic Polynomial time are efficiently reducible.  Come to think of it, the letters N and P  also breed confusion in other fields, including our own, where  NPT is often taken to stand for Negative Partial Transpose, when it would be more correct to say Nonpositive Partial Transpose, admittedly a tiny imprecision compared to the confusion surrounding what NP means.
 

What increases when a self-organizing system organizes itself? Logical depth to the rescue.

(An earlier version of this post appeared in the latest newsletter of the American Physical Society’s special interest group on Quantum Information.)
One of the most grandly pessimistic ideas from the 19th century is that of  “heat death” according to which a closed system, or one coupled to a single heat bath at thermal  equilibrium,  eventually inevitably settles into an uninteresting state devoid of life or macroscopic motion.  Conversely, in an idea dating back to Darwin and Spencer, nonequilibrium boundary conditions are thought to have caused or allowed the biosphere to self-organize over geological time.  Such godless creation, the bright flip side of the godless hell of heat death, nowadays seems to worry creationists more than Darwin’s initially more inflammatory idea that people are descended from apes. They have fought back, using superficially scientific arguments, in their masterful peanut butter video.
Self-organization versus heat death
Much simpler kinds of complexity generation occur in toy models with well-defined dynamics, such as this one-dimensional reversible cellular automaton.  Started from a simple initial condition at the left edge (periodic, but with a symmetry-breaking defect) it generates a deterministic wake-like history of growing size and complexity.  (The automaton obeys a second order transition rule, with a site’s future differing from its past iff exactly two of its first and second neighbors in the current time slice, not counting the site itself, are black and the other two are white.)
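The second-order rule in the parenthesis can be sketched in a few lines. This is a minimal sketch, assuming a ring of cells with periodic boundary conditions and an arbitrary two-slice initial condition; both are choices made here for illustration and are not the initial condition used in the figure.

```python
def step(prev, curr):
    """Advance one time step; returns the new (prev, curr) pair of slices.

    A site's future differs from its past iff exactly two of its four
    first and second neighbors in the current slice are black (1).
    """
    n = len(curr)
    nxt = []
    for i in range(n):
        black = sum(curr[(i + d) % n] for d in (-2, -1, 1, 2))
        flip = 1 if black == 2 else 0
        nxt.append(prev[i] ^ flip)
    return curr, nxt


def run(prev, curr, steps):
    """Apply `step` repeatedly and return the final (prev, curr) pair."""
    for _ in range(steps):
        prev, curr = step(prev, curr)
    return prev, curr


# Hypothetical initial slices on a ring of 16 cells, with a single defect.
past = [0] * 16
present = [0, 1] * 8
present[5] ^= 1

a, b = run(past, present, 50)
# Reversibility: swapping the two slices and running the same rule undoes the history.
print(run(b, a, 50) == (present, past))
```

Because the rule only XORs a function of the current slice into the past slice, the same rule run on the swapped pair retraces the history, which is what makes the automaton reversible.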

Fig. 2. Space-time history of the cellular automaton (Time →)
But just what is it that increases when a self-organizing system organizes itself?
Such organized complexity is not a thermodynamic potential like entropy or free energy.  To see this, consider transitions between a flask of sterile nutrient solution and the bacterial culture it would become if inoculated by a single seed bacterium.  Without the seed bacterium, the transition from sterile nutrient to bacterial culture is allowed by the Second Law, but prohibited by a putative “slow growth law”, which prohibits organized complexity from increasing quickly, except with low probability.
Fig. 3

The same example shows that organized complexity is not an extensive quantity like free energy.  The free energy of a flask of sterile nutrient would be little altered by adding a single seed bacterium, but its organized complexity must have been greatly increased by this small change.  The subsequent growth of many bacteria is accompanied by a macroscopic drop in free energy, but little change in organized complexity.
The relation between universal computer programs and their outputs has long been viewed as a formal analog of the relation between theory and phenomenology in science, with the various programs generating a particular output x being analogous to alternative explanations of the phenomenon x.  This analogy draws its authority from the ability of universal computers to execute all formal deductive processes and their presumed ability to simulate all processes of physical causation.
In algorithmic information theory the Kolmogorov complexity of a bit string x is defined as the size in bits of its minimal program x*, the smallest (and lexicographically first, in case of ties) program causing a standard universal computer U to produce exactly x as output and then halt.
 x* = min{p: U(p)=x}
Because of the ability of universal machines to simulate one another, a string’s Kolmogorov complexity is machine-independent up to a machine-dependent additive constant, and similarly is equal to within an additive constant to the string’s algorithmic entropy H_U(x), the negative log of the probability that U would output exactly x and halt if its program were supplied by coin tossing.  Bit strings whose minimal programs are no smaller than the string itself are called incompressible, or algorithmically random, because they lack internal structure or correlations that would allow them to be specified more concisely than by a verbatim listing. Minimal programs themselves are incompressible to within O(1), since otherwise their minimality would be undercut by a still shorter program.  By contrast to minimal programs, any program p that is significantly compressible is intrinsically implausible as an explanation for its output, because it contains internal redundancy that could be removed by deriving it from the more economical hypothesis p*.  In terms of Occam’s razor, a program that is compressible by s bits is deprecated as an explanation of its output because it suffers from s bits worth of ad-hoc assumptions.
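Kolmogorov complexity itself is not computable, but any ordinary compressor gives a crude upper bound on it, which is enough to see the contrast between structured and random-looking strings. The sketch below uses zlib as a stand-in; the helper name `compressed_bits` and the two test strings are illustrative assumptions, and the numbers it prints say nothing about true minimal programs.

```python
import random
import zlib


def compressed_bits(data: bytes) -> int:
    """Size in bits of a zlib encoding of `data`: an upper bound, up to the
    fixed cost of the decompressor, on the data's Kolmogorov complexity."""
    return 8 * len(zlib.compress(data, 9))


n = 10_000
structured = b"01" * (n // 2)                # highly regular
rng = random.Random(0)                       # seeded pseudorandom generator
random_like = bytes(rng.getrandbits(8) for _ in range(n))

print(8 * n, compressed_bits(structured), compressed_bits(random_like))
```

The pseudorandom string is a useful caution: zlib cannot compress it, yet it was produced by a short program from a short seed, so its true Kolmogorov complexity is small. A compressor only ever certifies that a string is compressible, never that it is not.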
Though closely related[1] to statistical entropy, Kolmogorov complexity itself is not a good measure of organized complexity because it assigns high complexity to typical random strings generated by coin tossing, which intuitively are trivial and unorganized.  Accordingly many authors have considered modified versions of Kolmogorov complexity—also measured in entropic units like bits—hoping thereby to quantify the nontrivial part of a string’s information content, as opposed to its mere randomness.  A recent example is Scott Aaronson’s notion of complextropy, defined roughly as the number of bits in the smallest program for a universal computer to efficiently generate a probability distribution relative to which x  cannot efficiently be recognized as atypical.
However, I believe that entropic measures of complexity are generally unsatisfactory for formalizing the kind of complexity seen in intuitively complex objects, whether found in nature or gradually produced from simple initial conditions by simple dynamical processes, and that a better approach is to characterize an object’s complexity by the amount of number-crunching (i.e. computation time, measured in machine cycles, or more generally other dynamic computational resources such as time, memory, and parallelism) required to produce the object from a near-minimal-sized description.
This approach, which I have called  logical depth, is motivated by a common feature of intuitively organized objects found in nature: the internal evidence they contain of a nontrivial causal history.  If one accepts that an object’s minimal program represents its most plausible explanation, then the minimal program’s run time represents the number of steps in its most plausible history.  To make depth stable with respect to small variations in x or U, it is necessary also to consider programs other than the minimal one, appropriately weighted according to their compressibility, resulting in the following two-parameter definition.

  • An object x is called d-deep with s bits significance iff every program for U to compute x in time <d is compressible by at least s bits. This formalizes the idea that every hypothesis for x to have originated more quickly than in time d contains s bits worth of ad-hoc assumptions.

Dynamic and static resources, in the form of the parameters d and s, play complementary roles in this definition: d as the quantifier and s as the certifier of the object’s nontriviality.  Invoking the two parameters in this way not only stabilizes depth with respect to small variations of x and U, but also makes it possible to prove that depth obeys a slow growth law, without which any mathematical definition of organized complexity would seem problematic.
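For readers who prefer symbols, the two-parameter definition can be written roughly as follows. The notation is an assumption introduced here: T(p) is the run time of program p on U, |p| is its length in bits, and p* is the minimal program for p itself, so that |p| − |p*| is the number of bits by which p is compressible.

```latex
D_s(x) \;=\; \min\bigl\{\, T(p) \;:\; U(p) = x \ \text{and}\ |p| - |p^*| < s \,\bigr\},
\qquad
x \ \text{is $d$-deep with $s$ bits significance} \iff D_s(x) \ge d .
```

Read this way, D_s(x) is the shortest run time among the programs for x that are not yet ruled out as implausible at significance level s, and saying that x is d-deep is the same as saying that no such program finishes in time less than d.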

  • A fast deterministic process cannot convert shallow objects to deep ones, and a fast stochastic process can only do so with low probability.  (For details see Bennett88.)

 
Logical depth addresses many infelicities and problems associated with entropic measures of complexity.

  • It does not impose an arbitrary rate of exchange between the independent variables of strength of evidence and degree of nontriviality of what the evidence points to, nor an arbitrary maximum complexity that an object can have, relative to its size.  Just as a microscopic fossil can validate an arbitrarily long evolutionary process, so a small fragment of a large system, one that has evolved over a long time to a deep state, can contain evidence of the entire depth of the large system, which may be more than exponential in the size of the fragment.
  • It helps explain the increase of complexity at early times and its decrease at late times by providing different mechanisms for these processes.  In figure 2, for example, depth increases steadily at first because it reflects the duration of the system’s actual history so far.  At late times, when the cellular automaton has run for a generic time comparable to its Poincare recurrence time, the state becomes shallow again, not because the actual history was uneventful, but because evidence of that history has become degraded to the point of statistical insignificance, allowing the final state to be generated quickly from a near-incompressible program that short-circuits the system’s actual history.
  • It helps explain why some systems, despite being far from thermal equilibrium, never self-organize.  For example in figure 1, the gaseous sun, unlike the solid earth, appears to lack means of remembering many details about its distant past.  Thus while it contains evidence of its age (e.g. in its hydrogen/helium ratio) almost all evidence of particular details of its past, e.g. the locations of sunspots, is probably obliterated fairly quickly by the sun’s hot, turbulent dynamics.  On the other hand, systems with less disruptive dynamics, like our earth, could continue increasing in depth for as long as their nonequilibrium boundary conditions persisted, up to an exponential maximum imposed by Poincare recurrence.
  • Finally, depth is robust with respect to transformations that greatly alter an object’s size and Kolmogorov complexity, and many other entropic quantities, provided the transformation leaves intact significant evidence of a nontrivial history. Even a small sample of the biosphere, such as a single DNA molecule, contains such evidence.  Mathematically speaking, the depth of a string x is not much altered by replicating it (like the bacteria in the flask), padding it with zeros or random digits, or passing it through a noisy channel (although the latter treatment decreases the significance parameter s).  If the whole history of the earth were derandomized, by substituting deterministic pseudorandom choices for all its stochastic accidents, the complex objects in this substitute world would have very little Kolmogorov complexity, yet their depth would be about the same as if they had resulted from a stochastic evolution.

The remaining infelicities of logical depth as a complexity measure are those afflicting computational complexity and algorithmic entropy theories generally.

  • Lack of tight lower bounds: because of the open P=PSPACE question one cannot exhibit a system that provably generates depth more than polynomial in the space used.
  • Semicomputability:  deep objects can be proved deep (with exponential effort) but shallow ones can’t be proved shallow.  The semicomputability of depth, like that of Kolmogorov complexity, is an unavoidable consequence of the unsolvability of the halting problem.

The following observations partially mitigate these infelicities.

  • Using the theory of cryptographically strong pseudorandom functions one can argue (if such functions exist) that deep objects can be produced efficiently, in time polynomial and space polylogarithmic in their depth, and indeed that they are produced efficiently by some physical processes.
  • Semicomputability does not render a complexity measure entirely useless. Even though a particular string cannot be proved shallow, and requires an exponential amount of effort to prove it deep, the depth-producing properties of stochastic processes can be established, assuming the existence of cryptographically strong pseudorandom functions. This parallels the fact that while no particular string can be proved to be algorithmically random (incompressible), it can be proved that the statistically random process of coin tossing produces algorithmically random strings with high probability.

 
Granting that a logically deep object is one plausibly requiring a lot of computation to produce, one can consider various related notions:

  • An object  y  is deep relative to  x  if all near-minimal sized programs for computing  y  from  x  are slow-running.  Two shallow objects may be deep relative to one another, for example a random string and its XOR with a deep string.
  • An object can be called cryptic if it is computationally difficult to obtain a near-minimal sized program for the object from the object itself, in other words if any near-minimal sized program for x is deep relative to x.  One-way functions, if they exist, can be used to define cryptic objects; for example, in a computationally secure but information theoretically insecure cryptosystem, plaintexts should be cryptic relative to their ciphertexts.
  • An object can be called ambitious if, when presented to a universal computer as input, it causes the computer to embark on a long but terminating computation. Such objects, though they determine a long computation, do not contain evidence of it actually having been done.  Indeed they may be shallow and even algorithmically random.
  • An object can be called wise if it is deep and a large and interesting family of other deep objects are shallow relative to it. Efficient oracles for hard problems, such as the characteristic function of an NP-complete set, or the characteristic set K of the halting problem, are examples of wise objects.  Interestingly, Chaitin’s omega is an exponentially more compact oracle for the halting problem than K is, but it is so inefficient to use that it is shallow and indeed algorithmically random.

Further details about these notions can be found in Bennett88.  K.W. Regan in Dick Lipton’s blog discusses the logical depth of Bill Gasarch’s recently discovered solutions to the 17×17 and 18×18 four-coloring problems.
I close with some comments on the relation between organized complexity and thermal disequilibrium, which since the 19th century has been viewed as an important, perhaps essential, prerequisite for self-organization.   Broadly speaking, locally interacting systems at thermal equilibrium obey the Gibbs phase rule, and its generalization in which the set of independent parameters is enlarged to include not only intensive variables like temperature, pressure and magnetic field, but also all parameters of the system’s Hamiltonian, such as local coupling constants.   A consequence of the Gibbs phase rule is that for generic values of the independent parameters, i.e. at a generic point in the system’s phase diagram, only one phase is thermodynamically stable.  This means that if a system’s independent parameters are set to generic values, and the system is allowed to come to equilibrium, its structure will be that of this unique stable Gibbs phase, with spatially uniform properties and typically short-range correlations.   Thus for generic parameter values, when a system is allowed to relax to thermal equilibrium, it entirely forgets its initial condition and history and exists in a state whose structure can be adequately approximated by stochastically sampling the distribution of microstates characteristic of that stable Gibbs phase.  Dissipative systems—those whose dynamics is not microscopically reversible or whose boundary conditions prevent them from ever attaining thermal equilibrium—are exempt from the Gibbs phase rule for reasons discussed in BG85, and so are capable, other conditions being favorable, of producing structures of unbounded depth and complexity in the long time limit. For further discussion and a comparison of logical depth with other proposed measures of organized complexity, see B90.
 


[1] An elementary result of algorithmic information theory is that for any probability ensemble of bit strings (representing e.g. physical microstates), the ensemble’s Shannon entropy differs from the expectation of its members’ algorithmic entropy by at most the number of bits required to describe a good approximation to the ensemble.
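In symbols, the statement of this footnote can be sketched as below. The notation is an assumption introduced here: P is the ensemble, H_U(x) is the algorithmic entropy of the string x as in the main text, and K(P) stands for the number of bits needed to describe a good approximation to P.

```latex
\Bigl|\; \sum_x P(x)\, H_U(x) \;-\; \Bigl(-\sum_x P(x) \log_2 P(x)\Bigr) \Bigr|
\;\le\; K(P) + O(1)
```

For an ensemble with a short description, such as a thermal distribution specified by a few parameters, the right-hand side is small compared with the entropy of a macroscopic system, which is the sense in which the two entropies are closely related.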

 
 

Having it both ways

In one of Jorge Luis Borges’ historical fictions, an elderly Averroes, remarking on a misguided opinion of his youth,  says that to be free of an error it is well to have professed it oneself.  Something like this seems to have happened on a shorter time scale in the ArXiv, with last November’s The quantum state cannot be interpreted statistically  sharing two authors with this January’s The quantum state can be interpreted statistically.  The more recent paper explains that the two results are actually consistent because the later paper abandons the earlier paper’s assumption that independent preparations result in an ontic state of product form.  To us this seems an exceedingly natural assumption, since it is hard to see how inductive inference would work in a world where independent preparations did not result in independent states.  To their credit,  and unlike flip-flopping politicians, the authors do not advocate or defend their more recent position; they only assert that it is logically consistent.

Debate!

Before I did math, I did debate.
And now I’ve found a way to reconcile the two! Gil Kalai is a great researcher in theoretical computer science who has written several articles that are skeptical of the possibility of quantum computers ever being built. I think his ideas are interesting, but wrong.
So we’ve agreed to debate on the neutral territory of Dick Lipton and Ken Regan’s blog, Gödel’s Lost Letter.
The debate begins with Gil laying out his case for skepticism of quantum computing.
In a few days, I’ll post my reply there. Stay tuned…