More on the NP-hardness of inferring dynamics

The previous post on David Voss’ APS piece quibbled perhaps excessively about the definition of NP, but neglected to mention  the actual subject of the piece, which was Cubitt, Eisert and Wolf’s (CEW) recent paper on the NP-hardness of extracting dynamical equations from experimental data.  This paper raises, and partly answers,  some subtle questions about the relation between computation and physics.  For example, one might ask, if the problem of inferring dynamics from experiment is intractable in general, doesn’t this doom the whole program of theoretical physics?  We will argue below that the results of CEW do not support such a  pessimistic view.  What CEW do show about the problem of inferring dynamics from observation (more details here) is that a seemingly easier problem—that of determining whether a given completely-positive trace-preserving map (representing a quantum system’s evolution over a discrete time interval) is consistent with some underlying Markov process operating in continuous time—is NP-complete.   Continuous time Markov processes, described by time-independent Lindblad operators, model the evolution of an open quantum system in contact with a memoryless environment.  Of course this is an approximation, since no environment can be entirely memoryless over very short time intervals, but it works quite well in many practical situations where a quantum system with only a few degrees of freedom interacts with a large, rapidly-relaxing environment such as a heat bath.  For such small systems  NP-completeness does not render a problem intractable, and the result of CEW can be used to infer a very nearly correct dynamical description—a Lindblad equation—from experimental data consisting of discrete time snapshots of the evolving open quantum system.
Suppose on the other hand we apply the CEW algorithm to some experimental data and find that it doesn’t fit any Markovian dynamics.  Then either the data is wrong or memory effects in the environment are too important to  be neglected.  To understand the dynamics of such systems we must treat the environment more respectfully, somehow modeling its most significant memory effects, or, if all else fails, “going to the church of the larger Hilbert space” and treating the system plus environment as a larger closed system, evolving unitarily.  This raises the question of whether an intractability result analogous to CEW’s finding for open systems also applies to unitarily evolving closed quantum systems.  We suspect that it does not, and that the problem of fitting a Hamiltonian to a series of snapshots of a unitarily evolving quantum system may be tractable, at least if the Hamiltonian is of the approximately local form commonly encountered in physics, and if the experimenter is free to choose the times of the snapshots.   The inferability of dynamics from experimental data, which as we indicated above underlies the whole program of theoretical physics, is, we believe, related to the quantum Church-Turing thesis,  that physical processes can be efficiently simulated by a quantum computer.
Be that as it may, what CEW show, in a positive sense, is how (albeit very laboriously for large systems) to infer Markovian dynamics when the Markovian approximation is justified, and in a negative sense, when one should abandon the Markovian approximation and infer the dynamics by other means.

Hardness of NP

In computer science, NP-hard problems are widely believed to be intractable, not because they have been proved so, but on the empirical evidence of no one having found a fast algorithm for any of them in over half a century of trying.  But the concepts of  NP-hardness and NP-completeness are themselves hard for newcomers to understand.   The current American Physical Society piece Unbearable Hardness of Physics makes a common mistake when it takes NP-hard problems to mean problems Not solvable in time Polynomial in the size of their input, rather than those to which all problems solvable in Nondeterministic Polynomial time are efficiently reducible.  Come to think of it, the letters N and P  also breed confusion in other fields, including our own, where  NPT is often taken to stand for Negative Partial Transpose, when it would be more correct to say Nonpositive Partial Transpose, admittedly a tiny imprecision compared to the confusion surrounding what NP means.
 

The Nine Circles of LaTeX Hell

This guy had an overfull hbox
Poorly written LaTeX is like a rash. No, you won’t die from it, but it needlessly complicates your life and makes it difficult to focus on pertinent matters. The victims of this unfortunate blight can be either the readers, as in the case of bad typography, or yourself and your coauthors, as in the case of bad coding practice.
Today, in an effort to combat this particular scourge (and in keeping with the theme of this blog’s title), I will be your Virgil on a tour through the nine circles of LaTeX hell. My intention is not to shock or frighten you, dear Reader. I hope, like Dante before me, to promote a more virtuous lifestyle by displaying the wages of sin. However, unlike Dante I will refrain from pointing out all the famous people that reside in these various levels of hell. (I’m guessing Dante already had tenure when he wrote The Inferno.)
The infractions will be minor at first, becoming progressively more heinous, until we reach utter depravity at the ninth level. Let us now descend the steep and savage path.

1) Using {\it ...} and {\bf ...}, etc.

Admittedly, this point is a quibble at best, but let me tell you why to use \textit{...} and \textbf{...} instead. First, \it and \bf don’t perform correctly under composition (in fact, they reset all font attributes), so {\it {\bf ...}} does not produce bold italics as you might expect. Second, they fail to correct the spacing for italicized letters. Compare \text{{\it test}text} to \text{\textit{test}text} and notice how crowded the former is.
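To see both failures side by side, here is a minimal sketch, assuming the standard article class (which still defines \it and \bf for backward compatibility; some other classes do not define them at all).

```latex
\documentclass{article}
\begin{document}
% Old-style switches: \bf resets the shape, so this is bold upright, not bold italic.
{\it {\bf nested}}

% The \text.. commands compose, so this is bold italic.
\textit{\textbf{nested}}

% \textit adds an italic correction before the upright text; {\it ...} does not.
{\it test}text versus \textit{test}text
\end{document}
```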

2) Using \def

\def is a plain TeX command that writes macros without first checking if there was a preexisting macro. Hence it will overwrite something without producing an error message. This one can be dangerous if you have coauthors: maybe you use \E to mean \mathbb{E}, while your coauthor uses it to mean \mathcal{E}. If you are writing different sections of the paper, then you might introduce some very confusing typos. Use \newcommand or \renewcommand instead.
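A minimal sketch of the difference, using the \E clash described above. The two commented-out lines are the interesting ones: try enabling each in turn.

```latex
\documentclass{article}
\usepackage{amssymb} % provides \mathbb
\begin{document}
\newcommand{\E}{\mathbb{E}}      % first definition: fine
% \newcommand{\E}{\mathcal{E}}   % error: "Command \E already defined"
% \def\E{\mathcal{E}}            % no error: silently replaces the definition above
\renewcommand{\E}{\mathcal{E}}   % a deliberate, visible override
$\E[X]$
\end{document}
```

If you only want a fallback that never clobbers a coauthor's macro, \providecommand defines the command only when it does not already exist.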

3) Using $$...$$

This one is another plain TeX command. It messes up vertical spacing within formulas, making them inconsistent, and it causes fleqn to stop working. Moreover, it is syntactically harder to parse since you can’t detect an unmatched pair as easily. Using \[...\] avoids these issues.
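A quick way to check the fleqn claim yourself is the following sketch, which sets the same formula both ways. The first display should stay centered and the second should move to the left margin.

```latex
\documentclass[fleqn]{article}
\begin{document}
Plain \TeX{} display, which ignores the fleqn option:
$$ a^2 + b^2 = c^2 $$
\LaTeX{} display, which honors it:
\[ a^2 + b^2 = c^2 \]
\end{document}
```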

4) Using the eqnarray environment

If you don’t believe me that eqnarray is bad news, then ask Lars Madsen, the author of “Avoid eqnarray!”, a 10-page treatise on the subject. It handles spacing in an inconsistent manner and will overwrite the equation numbers for long equations. You should use the amsmath package with the align environment instead.
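For anyone who has never made the switch, a minimal align example looks roughly like this. The formula is just a placeholder; the point is the single & before each relation sign.

```latex
\documentclass{article}
\usepackage{amsmath} % provides the align environment
\begin{document}
\begin{align}
  (a+b)^2 &= (a+b)(a+b) \notag \\ % \notag suppresses the number on this line
          &= a^2 + 2ab + b^2
\end{align}
\end{document}
```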
Now we begin reaching the inner circles of LaTeX hell, where the crimes become noticeably more sinister.

5) Using standard size parentheses around display size fractions

Consider the following abomination: (\displaystyle \frac{1+x}{2})^n (\frac{x^k}{1+x^2})^m = (\int_{-\infty}^{\infty} \mathrm{e}^{-u^2}\mathrm{d}u )^2.
Go on, stare at this for one minute and see if you don’t want to tear your eyes out. Now you know how your reader feels when you are too lazy to use \left and \right.
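For comparison, here is a sketch of the same expression with the delimiters sized by \left and \right. The formula is only a typographic specimen, as above; nothing about its content has changed.

```latex
\documentclass{article}
\begin{document}
\[
  \left( \frac{1+x}{2} \right)^n
  \left( \frac{x^k}{1+x^2} \right)^m
  = \left( \int_{-\infty}^{\infty} \mathrm{e}^{-u^2} \, \mathrm{d}u \right)^2
\]
\end{document}
```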

6) Not using bibtex

Manually writing your bibliography makes it more likely that you will make a mistake and adds a huge unnecessary workload for yourself and your coauthors. If you don’t already use bibtex, then make the switch today.

7) Using text in math mode

Writing H_{effective} is horrendous, but even H_{eff} is an affront. The broken ligature makes these examples particularly bad. There are lots of ways to avoid this, like using \text or \mathrm, which lead to the much more elegant and legible H_{\text{eff}}. Don’t use \mbox, though, because it doesn’t get the font size right: H_{\mbox{eff}}.
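The four variants can be compared directly with a sketch like the one below, assuming amsmath is loaded for \text.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \text
\begin{document}
$H_{effective}$     % math italic: typeset as a product of single-letter variables
$H_{\text{eff}}$    % surrounding text font, scaled down to subscript size
$H_{\mathrm{eff}}$  % always upright roman, also scaled to subscript size
$H_{\mbox{eff}}$    % text font, but stuck at the body-text size
\end{document}
```

One practical difference between the two good options: \text follows the surrounding text font (so it turns italic inside an italicized theorem statement), while \mathrm stays upright everywhere.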

8) Using a greater-than sign for a ket

At this level of hell in Dante’s Inferno, some of the accursed are being whipped by demons for all eternity. This seems to be about the right level of punishment for people who use the obscenity |\psi>.
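Salvation is two lines of preamble away. The macro names \ket and \bra below are just a common convention defined here for illustration; packages such as braket provide their own versions if you would rather not roll your own.

```latex
\documentclass{article}
\begin{document}
\newcommand{\ket}[1]{\left| #1 \right\rangle}
\newcommand{\bra}[1]{\left\langle #1 \right|}
The sin: $|\psi>$. The redemption: $\ket{\psi}$ and $\bra{\phi}$.
\end{document}
```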

9) Not using TeX or LaTeX

This one is so bad, it tops Scott’s list of signs that a claimed mathematical breakthrough is wrong. If you are typing up your results in Microsoft Word using Comic Sans font, then perhaps you should be filling out TPS reports instead of writing scientific papers.

Scott Aaronson wins Alan T. Waterman Award

The NSF just announced that our own Scott Aaronson has been named a co-recipient of this year’s prestigious Alan T. Waterman award! The award is granted to outstanding researchers under the age of 35 across any field of science or engineering which is supported by the NSF.
Not only is this great news for Scott, but a rising tide lifts all boats: the entire field of quantum computing benefits when our talented researchers get recognition for their achievements.
Congratulations to The Optimizer on this richly deserved award.

What increases when a self-organizing system organizes itself? Logical depth to the rescue.

(An earlier version of this post appeared in the latest newsletter of the American Physical Society’s special interest group on Quantum Information.)
One of the most grandly pessimistic ideas from the 19th century is that of “heat death”, according to which a closed system, or one coupled to a single heat bath at thermal equilibrium, eventually and inevitably settles into an uninteresting state devoid of life or macroscopic motion.  Conversely, in an idea dating back to Darwin and Spencer, nonequilibrium boundary conditions are thought to have caused or allowed the biosphere to self-organize over geological time.  Such godless creation, the bright flip side of the godless hell of heat death, nowadays seems to worry creationists more than Darwin’s initially more inflammatory idea that people are descended from apes. They have fought back, using superficially scientific arguments, in their masterful peanut butter video.
Fig. 1. Self-organization versus heat death
Much simpler kinds of complexity generation occur in toy models with well-defined dynamics, such as this one-dimensional reversible cellular automaton.  Started from a simple initial condition at the left edge (periodic, but with a symmetry-breaking defect) it generates a deterministic wake-like history of growing size and complexity.  (The automaton obeys a second order transition rule, with a site’s future differing from its past iff exactly two of its first and second neighbors in the current time slice, not counting the site itself, are black and the other two are white.)

Fig. 2. (Time runs from left to right →)
But just what is it that increases when a self-organizing system organizes itself?
Such organized complexity is not a thermodynamic potential like entropy or free energy.  To see this, consider transitions between a flask of sterile nutrient solution and the bacterial culture it would become if inoculated by a single seed bacterium.  Without the seed bacterium, the transition from sterile nutrient to bacterial culture is allowed by the Second Law, but prohibited by a putative “slow growth law”, which prohibits organized complexity from increasing quickly, except with low probability.
Fig. 3

The same example shows that organized complexity is not an extensive quantity like free energy.  The free energy of a flask of sterile nutrient would be little altered by adding a single seed bacterium, but its organized complexity must have been greatly increased by this small change.  The subsequent growth of many bacteria is accompanied by a macroscopic drop in free energy, but little change in organized complexity.
The relation between universal computer programs and their outputs has long been viewed as a formal analog of the relation between theory and phenomenology in science, with the various programs generating a particular output x being analogous to alternative explanations of the phenomenon x.  This analogy draws its authority from the ability of universal computers to execute all formal deductive processes and their presumed ability to simulate all processes of physical causation.
In algorithmic information theory the Kolmogorov complexity of a bit string x is defined as the size in bits of its minimal program x*, the smallest (and lexicographically first, in case of ties) program causing a standard universal computer U to produce exactly x as output and then halt.
 x* = min{p: U(p)=x}
Because of the ability of universal machines to simulate one another, a string’s Kolmogorov complexity is machine-independent up to a machine-dependent additive constant, and similarly is equal to within an additive constant to the string’s algorithmic entropy H_U(x), the negative log of the probability that U would output exactly x and halt if its program were supplied by coin tossing.  Bit strings whose minimal programs are no smaller than the string itself are called incompressible, or algorithmically random, because they lack internal structure or correlations that would allow them to be specified more concisely than by a verbatim listing. Minimal programs themselves are incompressible to within O(1), since otherwise their minimality would be undercut by a still shorter program.  By contrast to minimal programs, any program p that is significantly compressible is intrinsically implausible as an explanation for its output, because it contains internal redundancy that could be removed by deriving it from the more economical hypothesis p*.  In terms of Occam’s razor, a program that is compressible by s bits is deprecated as an explanation of its output because it suffers from s bits worth of ad-hoc assumptions.
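In symbols, the quantities defined in the last two paragraphs can be sketched as follows. Two things here are assumptions not spelled out in the text: the notation K(x) and |p| (the length of p in bits), and the requirement that U be a self-delimiting (prefix-free) universal machine, which is what makes the coin-tossing probability well defined and the last line hold.

```latex
\begin{align*}
  K(x)   &= |x^*| = \min\{\, |p| : U(p) = x \,\}, \\
  H_U(x) &= -\log_2 \sum_{p \,:\, U(p) = x} 2^{-|p|}, \\
  K(x)   &= H_U(x) + O(1).
\end{align*}
```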
Though closely related[1] to statistical entropy, Kolmogorov complexity itself is not a good measure of organized complexity because it assigns high complexity to typical random strings generated by coin tossing, which intuitively are trivial and unorganized.  Accordingly many authors have considered modified versions of Kolmogorov complexity—also measured in entropic units like bits—hoping thereby to quantify the nontrivial part of a string’s information content, as opposed to its mere randomness.  A recent example is Scott Aaronson’s notion of complextropy, defined roughly as the number of bits in the smallest program for a universal computer to efficiently generate a probability distribution relative to which x  cannot efficiently be recognized as atypical.
However, I believe that entropic measures of complexity are generally unsatisfactory for formalizing the kind of complexity seen in intuitively complex objects found in nature or gradually produced from simple initial conditions by simple dynamical processes, and that a better approach is to characterize an object’s complexity by the amount of number-crunching (i.e. computation time, measured in machine cycles, or more generally other dynamic computational resources such as time, memory, and parallelism) required to produce the object from a near-minimal-sized description.
This approach, which I have called  logical depth, is motivated by a common feature of intuitively organized objects found in nature: the internal evidence they contain of a nontrivial causal history.  If one accepts that an object’s minimal program represents its most plausible explanation, then the minimal program’s run time represents the number of steps in its most plausible history.  To make depth stable with respect to small variations in x or U, it is necessary also to consider programs other than the minimal one, appropriately weighted according to their compressibility, resulting in the following two-parameter definition.

  • An object x is called d-deep with s bits significance iff every program for U to compute x in time <d is compressible by at least s bits. This formalizes the idea that every hypothesis for x to have originated more quickly than in time d contains s bits worth of ad-hoc assumptions.
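One way to write this definition symbolically is sketched below. The notation is an assumption introduced here for illustration: T(p) is the running time of program p on U, |p| is its length in bits, and p* is the minimal program for p (viewed as a string), so |p| - |p*| is the number of bits by which p is compressible.

```latex
\begin{align*}
  D_s(x) &= \min\{\, T(p) : U(p) = x \text{ and } |p| - |p^*| < s \,\}, \\
  x \text{ is $d$-deep with $s$ bits significance}
         &\iff D_s(x) \ge d .
\end{align*}
```

In words, D_s(x) is the shortest running time among programs for x that are not compressible by s or more bits, so requiring D_s(x) to be at least d says exactly that every faster program is compressible by at least s bits.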

Dynamic and static resources, in the form of the parameters d and s, play complementary roles in this definition: d as the quantifier and s as the certifier of the object’s nontriviality.  Invoking the two parameters in this way not only stabilizes depth with respect to small variations of x and U, but also makes it possible to prove that depth obeys a slow growth law, without which any mathematical definition of organized complexity would seem problematic.

  • A fast deterministic process cannot convert shallow objects to deep ones, and a fast stochastic process can only do so with low probability.  (For details see Bennett88.)

 
Logical depth addresses many infelicities and problems associated with entropic measures of complexity.

  • It does not impose an arbitrary rate of exchange between the independent variables of strength of evidence and degree of nontriviality of what the evidence points to, nor an arbitrary maximum complexity that an object can have, relative to its size.  Just as a microscopic fossil can validate an arbitrarily long evolutionary process, so a small fragment of a large system, one that has evolved over a long time to a deep state, can contain evidence of the entire depth of the large system, which may be more than exponential in the size of the fragment.
  • It helps explain the increase of complexity at early times and its decrease at late times by providing different mechanisms for these processes.  In figure 2, for example, depth increases steadily at first because it reflects the duration of the system’s actual history so far.  At late times, when the cellular automaton has run for a generic time comparable to its Poincare recurrence time, the state becomes shallow again, not because the actual history was uneventful, but because evidence of that history has become degraded to the point of statistical insignificance, allowing the final state to be generated quickly from a near-incompressible program that short-circuits the system’s actual history.
  • It helps explain why some systems, despite being far from thermal equilibrium, never self-organize.  For example in figure 1, the gaseous sun, unlike the solid earth, appears to lack means of remembering many details about its distant past.  Thus while it contains evidence of its age (e.g. in its hydrogen/helium ratio) almost all evidence of particular details of its past, e.g. the locations of sunspots, is probably obliterated fairly quickly by the sun’s hot, turbulent dynamics.  On the other hand, systems with less disruptive dynamics, like our earth, could continue increasing in depth for as long as their nonequilibrium boundary conditions persisted, up to an exponential maximum imposed by Poincare recurrence.
  • Finally, depth is robust with respect to transformations that greatly alter an object’s size and Kolmogorov complexity, and many other entropic quantities, provided the transformation leaves intact significant evidence of a nontrivial history. Even a small sample of the biosphere, such as a single DNA molecule, contains such evidence.  Mathematically speaking, the depth of a string x is not much altered by replicating it (like the bacteria in the flask), padding it with zeros or random digits, or passing it through a noisy channel (although the latter treatment decreases the significance parameter s).  If the whole history of the earth were derandomized, by substituting deterministic pseudorandom choices for all its stochastic accidents, the complex objects in this substitute world would have very little Kolmogorov complexity, yet their depth would be about the same as if they had resulted from a stochastic evolution.

The remaining infelicities of logical depth as a complexity measure are those afflicting computational complexity and algorithmic entropy theories generally.

  • Lack of tight lower bounds: because of the open P=PSPACE question one cannot exhibit a system that provably generates depth more than polynomial in the space used.
  • Semicomputability:  deep objects can be proved deep (with exponential effort) but shallow ones can’t be proved shallow.  The semicomputability of depth, like that of Kolmogorov complexity, is an unavoidable consequence of the unsolvability of the halting problem.

The following observations partially mitigate these infelicities.

  • Using the theory of cryptographically strong pseudorandom functions one can argue (if such functions exist) that deep objects can be produced efficiently, in time polynomial and space polylogarithmic in their depth, and indeed that they are produced efficiently by some physical processes.
  • Semicomputability does not render a complexity measure entirely useless. Even though a particular string cannot be proved shallow, and requires an exponential amount of effort to prove it deep, the depth-producing properties of stochastic processes can be established, assuming the existence of cryptographically strong pseudorandom functions. This parallels the fact that while no particular string can be proved to be algorithmically random (incompressible), it can be proved that the statistically random process of coin tossing produces algorithmically random strings with high probability.

 
Granting that a logically deep object is one plausibly requiring a lot of computation to produce, one can consider various related notions:

  • An object  y  is deep relative to  x  if all near-minimal sized programs for computing  y  from  x  are slow-running.  Two shallow objects may be deep relative to one another, for example a random string and its XOR with a deep string.
  • An object can be called cryptic if it is computationally difficult to obtain a near-minimal sized program for the object from the object itself, in other words if any near-minimal sized program for x is deep relative to x.  One-way functions, if they exist, can be used to define cryptic objects; for example, in a computationally secure but information theoretically insecure cryptosystem, plaintexts should be cryptic relative to their ciphertexts.
  • An object can be called ambitious if, when presented to a universal computer as input, it causes the computer to embark on a long but terminating computation. Such objects, though they determine a long computation, do not contain evidence of it actually having been done.  Indeed they may be shallow and even algorithmically random.
  • An object can be called wise if it is deep and a large and interesting family of other deep objects are shallow relative to it. Efficient oracles for hard problems, such as the characteristic function of an NP-complete set, or the characteristic set K of the halting problem, are examples of wise objects.  Interestingly, Chaitin’s omega is an exponentially more compact oracle for the halting problem than K is, but it is so inefficient to use that it is shallow and indeed algorithmically random.

Further details about these notions can be found in Bennett88.  K.W. Regan in Dick Lipton’s blog discusses the logical depth of Bill Gasarch’s recently discovered solutions to the 17×17 and 18×18 four-coloring problems.
I close with some comments on the relation between organized complexity and thermal disequilibrium, which since the 19th century has been viewed as an important, perhaps essential, prerequisite for self-organization.   Broadly speaking, locally interacting systems at thermal equilibrium obey the Gibbs phase rule, and its generalization in which the set of independent parameters is enlarged to include not only intensive variables like temperature, pressure and magnetic field, but also all parameters of the system’s Hamiltonian, such as local coupling constants.   A consequence of the Gibbs phase rule is that for generic values of the independent parameters, i.e. at a generic point in the system’s phase diagram, only one phase is thermodynamically stable.  This means that if a system’s independent parameters are set to generic values, and the system is allowed to come to equilibrium, its structure will be that of this unique stable Gibbs phase, with spatially uniform properties and typically short-range correlations.   Thus for generic parameter values, when a system is allowed to relax to thermal equilibrium, it entirely forgets its initial condition and history and exists in a state whose structure can be adequately approximated by stochastically sampling the distribution of microstates characteristic of that stable Gibbs phase.  Dissipative systems—those whose dynamics is not microscopically reversible or whose boundary conditions prevent them from ever attaining thermal equilibrium—are exempt from the Gibbs phase rule for reasons discussed in BG85, and so are capable, other conditions being favorable, of producing structures of unbounded depth and complexity in the long time limit. For further discussion and a comparison of logical depth with other proposed measures of organized complexity, see B90.
 


[1] An elementary result of algorithmic information theory is that for any probability ensemble of bit strings (representing e.g. physical microstates), the ensemble’s Shannon entropy differs from the expectation of its members’ algorithmic entropy by at most the number of bits required to describe a good approximation to the ensemble.

 
 

Randomized Governance

What if instead of electing our representatives in government, we simply chose them at random?
A new Rasmussen poll asked 1,000 likely voters exactly this question. Turns out, 43% thought that a random choice of people from the phonebook would do a better job than the current legislators, a plurality. Of course, these people were themselves chosen randomly from a phonebook, so I’m not sure they are entirely unbiased. 🙂
But why stop at the legislators? Why not just write random legislation using context-free grammars? We already have software that can automatically write scientific papers, so it doesn’t seem like a stretch. I guess that a lot of this random legislation would be better than SOPA.

A Federal Mandate for Open Science

Witness the birth of the Federal Research Public Access Act:

“The Federal Research Public Access Act will encourage broader collaboration among scholars in the scientific community by permitting widespread dissemination of research findings.  Promoting greater collaboration will inevitably lead to more innovative research outcomes and more effective solutions in the fields of biomedicine, energy, education, quantum information theory and health care.”

[Correction: it didn’t really mention quantum information theory—SF.]

You can read the full text of FRPAA here.
The bill states that any federal agency which budgets more than $100 million per year for funding external research must make that research available in a public online repository for free download no later than 6 months after the research has been published in a peer-reviewed journal.
This looks to me like a big step in the right direction for open science. Of course, it’s still just a bill, and needs to successfully navigate the Straits of the Republican-controlled House, through the Labyrinth of Committees and the Forest of Filibuster, and run the Gauntlet of Presidential Vetoes. How can you help it survive this harrowing journey? Write your senators and your congresscritter today, and tell them that you support FRPAA and open science!
Hat tip to Robin Blume-Kohout.

Having it both ways

In one of Jorge Luis Borges’ historical fictions, an elderly Averroes, remarking on a misguided opinion of his youth,  says that to be free of an error it is well to have professed it oneself.  Something like this seems to have happened on a shorter time scale in the ArXiv, with last November’s The quantum state cannot be interpreted statistically  sharing two authors with this January’s The quantum state can be interpreted statistically.  The more recent paper explains that the two results are actually consistent because the later paper abandons the earlier paper’s assumption that independent preparations result in an ontic state of product form.  To us this seems an exceedingly natural assumption, since it is hard to see how inductive inference would work in a world where independent preparations did not result in independent states.  To their credit,  and unlike flip-flopping politicians, the authors do not advocate or defend their more recent position; they only assert that it is logically consistent.