2024 In Books

Total books read: 40 (see my goodreads profile for a full list)

Best book: When We Cease to Understand the World by Benjamín Labatut. My review:

This book is really going to bork my knowledge of the science history of Haber, Schwarzchild, Grothendieck, Heisenberg, and Schrödinger, but that’s ok. A dark fiction of nonfictions, it is a book of imagined tails of these giants on the edge of discovery and disaster.

Some will be offended by the male genius trope, but others might consider that these are not heroes, and that science on the edge is, by definition, madness. Most scientists never get to live on this boundary, spending their lives in the cold comfort of existing structures, and Labatut captures this terrifying boundary in dense dark prose.

Best book’s buddy book who was not nearly as good: The Maniac by Benjamín Labatut. Perhaps because I know a lot about von Neumann, and the surrounding characters, this book fell far short of When We Cease to Understand the World. A true masterpiece involving this particular Martian still awaits telling.

Most read author: Adrian Tchaikovsky. Cage of Souls, Service Model, Elder Race, and Alien Clay. I will continue to pick up most anything he writes.

Best non-fiction: Elements of Eloquence: How to Turn the Perfect English Phrase by Mark Forsyth. A romp though rhetorical grammatical structures, what they are and how they are used.

Non-fiction book I was most disappointed with: The Impossible Man: Roger Penrose and the Cost of Genius by Patchen Barrs. I had high expectations, Penrose is certainly one of the most fascinating theoretical physicist of the last century. Somewhere I have a xeroxed copy of his PhD thesis on spin networks. Sadly this book focuses more on the complicated relationships in his life, and less on the complicated physics and mathematics in his life.

Number of Neal Stephenson novels: 1 new (Polostan), 2 rereads (Snowcrash, Seveneves)

Books in progress: 4. I still have 12 hours to finish them!

  1. Neuromancer by William Gibson (reread)
  2. The Transmigration of Timothy Archer by Philip K. Dick (reread)
  3. Discovering : Inventing Solving Problems at the Frontiers of Scientific Knowledge by Robert Root-Bernstein
  4. Life as No One Knows It: The Physics of Life’s Emergence by Sara Imari Walker

Next years reading goals: More non-fiction in areas I know little about.

New Quantum Algorithm Dance: DQI Edition

To work on trying to find novel quantum algorithms is to take a long lonely trip where progress is measured in failures. Whenever one of the brave souls (or group of souls) embarks on such a journey, and comes back with treasure that looks like a new quantum algorithm, we should all do the quantum algorithm dance and celebrate the returning heroes. It looks to me like we have recently had just such an event with the appearance of the preprint “Optimization by Decoded Interferometry” by Stephen P. Jordan, Noah Shutty, Mary Wootters, Adam Zalcman, Alexander Schmidhuber, Robbie King, Sergei V. Isakov, and Ryan Babbush [arXiv:2408.08292]. They call this algorithm Decoded Quantum Interferometry, or DQI (not to be confused with the name of the quantum computing division of the American Physical Society, but yes it will be confused). I was lucky enough to sit in the same office as Stephen, so I got to watch as he and the team made the discovery of the algorithm. The optimizer has blogged about the results of this algorithm, but I’m not a theoretical computer scientist, so what I’m most interested in is “how does it work”. In other words, what is the actual quantum algorithm, what does it actually do?

The general problem space that the DQI works on is optimization. In optimization problems one wants to maximize some function over a combinatorialy large domain. That is for a function $f$ one wants to find an $x$ such that $f(x) \geq f(y)$ for all $y \neq x$. Generally finding the maximum is very hard. If I give you $f$ as a black box, i.e. I don’t tell you how $f$ was generated, it’s very clear that one might have to search in the worst case over the entire space of inputs to the function. More practically, however, we are often given some succinct description of $f$, for example it might be a mathematical function like a polynomial , and we can ask how hard it is to find the maximum given this description. It turns out though that even then the problem is hard, this leads us to complexity classes like NP and friends. Even more practically, though, one could also loosen the requirement a bit, what if one wants to find a value $x$ such that $f(x)$ is as close the maximum as possible. It turns out that when one does this, there are certain regimes where wouldn’t be crazy for quantum algorithms to outperform classical algorithms. This is the target of the DQI algorithm.

Let’s “derive” DQI.

If you have a function $f$ you want to optimize, one thing you might think to do on a quantum computer is to prepare the state $$|f\rangle := \frac{1}{\sqrt{\mathcal F}} \sum_x f(x) |x\rangle$$The sum is over some large space of inputs to the function, for example we could imagine that the sum is over bitstrings of length $n$ (so the sum is over $2^n$ terms). For simplicity we assume the values $f$ produces are real numbers. Here ${\mathcal F}$ is a normalization factor. If one could produce this state, then the probability of observing $x$ if measuring this state is $$prob(x) = \frac{ |f(x)|^2}{\mathcal F}$$This doesn’t seem too impressive, but lets stick with it. In particular, we do notice that the probability of producing $x$ is higher where $|f(x)|$ is higher, so producing such a state gives us the ability to bias towards the values that maximize or minimize $f$.

We want to produce a state like $|f\rangle$ and we are using a quantum computer, so somehow we want to arrange it so that the quantum computer directs positive interference like effects to the places where $f$ is higher and negative interference like effect s to the places where $f$ is lower. We don’t have a lot of interfering type matrices in quantum (or rather all generic unitaries are like this, but structured unitaries that we can get a handle on are rarer), so lets just assume that the unitary which produces this is the $n$ qubit Hamadard transform. That is its own inverse, so we can say that if we start in the state $$|\hat{f}\rangle := \frac{1}{\sqrt{\mathcal F} 2^{n/2}} \sum_{x,y \in {\mathbb Z}_2^n} (-1)^{x \cdot y} f(x) |y\rangle$$where $x \cdot y = x_1 y_1 + \dots + x_n y_n ~{\rm mod}~2$, then applying $H^{\otimes n}$ to this state will produce the desired $|f\rangle$. We’ve changed the problem of preparing $|f\rangle$ to preparing another state, $|\hat{f}\rangle$.

What to do next? First stare at this state. What could the $f(x)$ possibly be that would make $|\hat{f}\rangle$ easy to prepare? Well one obvious thing would be if we had $$f(x) = (-1)^{b \cdot x}$$where $b \in {\mathbb Z}_2^n$. In that case, $|\hat{f}\rangle = |b\rangle$. In other words, if we can prepare $|b\rangle$ (which is trivial), then we can create $|f\rangle$ with $f(x) = (-1)^{b \cdot x}$. Now this isn’t to exciting of a function to maximize. And even worse when we prepare $|f\rangle$ since all the values of $f$ are $\pm 1$, and measure it we get all bit strings with equal probability. Boo.

But like all things in algorithm research, lets keep chugging along. Since $|b\rangle$ gave us an $f$ we could analyze, a natural thing to think about is what if you could produce a superpostion of inputs. If we start with $\frac{1}{\sqrt{2}}(|b_1\rangle + |b_2\rangle)$, where $b_i \in {\mathbb Z}_2^n$, then we see that we can produce $|f\rangle$ with $$f(x) = (-1)^{b_1 \cdot x} + (-1)^{b_2 \cdot x}$$Now this state is a bit more interesting, in that it doesn’t have an equal probability when we measure in the computational basis. Sometimes $b_1 \cdot x = b_2 \cdot x$ in which case the amplitudes add up, and other times $b_1 \cdot x \neq b_2 \cdot x$ in which case the amplitudes cancel.

Another observation is that we could change the values of which amplitudes cancel and which add, by instead of preparing $\frac{1}{\sqrt{2}}(|b_1\rangle + |b_2\rangle)$ instead preparing $\frac{1}{\sqrt{2}}((-1)^{v_1}|b_1\rangle + (-1)^{v_2}|b_2\rangle)$ where $v_i \in {\mathbb Z}_2$. Now the function is $$f(x) = (-1)^{v_1 + b_1 \cdot x} + (-1)^{v_2+b_2 \cdot x}$$.

Generaliizing a bit more, if we can prepare the state $$\frac{1}{\sqrt{m}} \sum_{j=1}^{m} (-1)^{v_j} |b_j\rangle$$ then we produce a $|f\rangle$ state with $$f(x) = \sum_{j=1}^m (-1)^{b_j \cdot x + v_j}.$$ Note that the way we have written things we can essentially ignore normalzation factors, because the final normalization factor ${\mathcal F}$ in our definition takes care of things. For instance we could have written $f(x) = \alpha \sum_{j=1}^m (-1)^{b_j \cdot x + v_j}$ for some non-zero $\alpha$ and this would also correspond to the same state, ${\mathcal F}$ takes care of the proper overall normalization.

Great! But where have we gotten ourselves. Well one thing we can note is that the $f$ we have produced has a nice interpretation. Consider the set of $m$ linear equations mod-$2$ $$\begin{align}b_1 \cdot x = v_1 \\ b_2 \cdot x = v_2 \\ \dots \quad \\ b_m \cdot x = v_m \end{align}$$Then $f(x) = \sum_{j=1}^m (-1)^{b_j \cdot x + v_j}$ is directly proportional to a count of how many of these simultaneous mod-$2$ linear equations we can satisfy. This should feel to you more like a traditional optimization problem. In particular it might remind you of $3$SAT. In $3$SAT one is given a set of $m$ clauses, each of clause being a boolean expression which is made up of an OR of 3 boolean variables or their negations. The goal of the $3$SAT problem is to find if there is a satisfying assignment where each clause evaluates to true. The optimization version of this problem is to find the a value of the boolean variables that maximizes the number of clauses that evaluate to TRUE. This is called max-$3$SAT The problem we have reduced to is like this, except the clause are now mod-$2$ equations. For this reason this problem is called max-XORSAT.

But ok, now what have we really shown? There are objections. The first objection is that we haven’t talked about the complexity of preparing the state $\frac{1}{\sqrt{m}} \sum_{j=1}^{m} (-1)^{v_j} |b_j\rangle$. The second objection is that we only see the maximum $x$ with probability $|f(x)|^2$, is this enough to figure out $x$ that maximizes $f(x)$ or that gives an $f(x)$ that is a good approximation to the maximal value?

Let’s talk about the second problem first. What if, instead of preparing $|f\rangle$ we instead try to prepare $$|f^2\rangle := \frac{1}{\sqrt{{\mathcal F}_2}} \sum_x f(x)^2 |x\rangle.$$ Here ${\mathcal F}_2$ is normalization factor to make the state normalized. This then will result in $x$ with probability proportional to $|f(x)|^4$, i.e. we will have even greater probability of seeing the maximal value.

Going with this idea, lets check what happens when we apply $H^{\otimes n}$ to $|f^2\rangle$ just as we did for $|f\rangle$ (so we can figure out what to prepare to be able to produce $|f^2\rangle$ by $n$-qubit Hadmard-ing). Let’s use the $f(x)$ we described above for max-XORSAT. Then $$f(x)^2 = \left ( \sum_{j=1}^m (-1)^{v_j +\sum_{i=1}^n b_j \cdot x} \right)^2= \sum_{j_1,j_2=1}^m (-1)^{v_{j_1} +v_{j_2} + \sum_{i=1}^n (b_{j_1} + b_{j_2}) \cdot x}.$$ We can split this sum into two terms, and using the fact that for $z\in {\mathbb Z}_2$ $z^2=0~{\rm mod}~2$ we can express this as $$f(x)^2= m + \sum_{j_1 \neq j_2 = 1}^m (-1)^{v_{j_1} +v_{j_2} + \sum_{i=1}^n (b_{j_1} + b_{j_2}) \cdot x}$$

From this we can now calculate what the state is that, if we Hadamard it, we end up with $|f^2\rangle$. This is $$|\hat{f^2}\rangle=\frac{1}{ 2^{n/2} \sqrt{ {\mathcal F}_2}}\sum_{y \in {\mathbb Z}_2^n} f(x)^2 (-1)^{x \cdot y} | y\rangle.$$ We can individually calculate the Fourier transform of the two terms in our expression for $f(x)^2$ above, but we do need to be careful to note the relative normalization between these terms. $$\alpha \left(m|0\rangle + \sum_{j_1 \neq j_2=1}^{m} (-1)^{v_{j_1+j_2}} |b_{j_1}+b_{j_2}\rangle \right)$$where the addition in the ket is done bitwise modulo $2$.

Great, so we’ve….gotten ourselves a new set of states we want to be able to produces. We can be a great lyric writer and make a list $$\begin{align} &|0\rangle \\ \sum_{j=1}^{m} &(-1)^{v_i} |b_i\rangle \\ \sum_{j_1,j_2=1}^{m} & (-1)^{v_{j_1}+v_{j_2}}|b_{j_1}+b_{j_2}\rangle \end{align}$$ Our list is of non-normalized states. How might we prepare such states (and superpositions of such states)? To do this we recall the inverse trick. Suppose you you can compute the function $g(x)$ from $k$ bits to $l$ bits and also the inverse $g^{-1}(x)$ from $l$ bits to $k$ bits. We further assume that we are working over a domain where we the inverse exist. In other words suppose you have quantum circuits that act as $U_g|x\rangle |y\rangle = |x\rangle |y\oplus g(x)\rangle$ and $U_{g^{-1}} |x\rangle|y\rangle = |x\rangle |y \oplus g^{-1}(x)\rangle$ where $\oplus$ is bitwise addition and the first register has $k$ bits and the second $l$ bits. Then if one starts with a superposition over $|x\rangle$ then one can convert it to a superposition over $g$ evaluated at the points in the superposition. i.e. $$U_{g^{-1}}^\updownarrow U_g \left( \frac{1}{\sqrt{N}} \sum_x |x\rangle \right) |0\rangle = |0\rangle \left( \frac{1}{\sqrt{N}} \sum_x |f(x)\rangle \right)$$ Here $\updownarrow$ means that we apply the $U$ with the registers flipped. This is the inverse trick: if we can efficiently convert the function and its inverse, then we can easily go from a superposition over inputs to a superposition over the function applied to these inputs.

Let’s now return to our list (good thing we made a list). The first state $|0\rangle$ is pretty easy to make. Lets look at the second and third states without the phase term $\sum_{j=1}^m |b_i\rangle$ and $\sum_{j_1,j_2=1}^m |b_{j_1}+b_{j_2}\rangle$. Now comes the trick that gives us the title. Take the $b_i$s and put them each as rows into a matrix. This matrix has $m$ rows and $n$ columns. We unimaginatively call this $m$ by $n$ matrix $B$. We can think about this as the parity check matrix for a linear error correcting code. In particular we can think about this as a code where we encode into $m$ bits. We haven’t said anything about the rate of this code, i.e. how many bits we are attempting to code into or the distance of the code. But given $m$ bits we can apply $B$ to these bits and we will produce a syndrome. The syndrome are the bits we can read out which, for a good code, will allow us to determine exactly what error occurred to information we encoded into the code.

Under this microscope, that $B$ is a syndrome matrix for a linear error correcting code, what are the states in our list. First of all $|0\rangle$ is always a codeword in a linear code. But now look at $\sum_{j=1}^m |b_i\rangle$. Each of these is the syndrome for code if there was exactly one bit flip error. And if we look at $\sum_{j_1,j_2=1}^m |b_{j_1}+b_{j_2}\rangle$, then this is the syndrome for the code if there was exactly two bit flip errors. Return now to the inverse trick. Suppose that we prepare all single qubit errors, $$\frac{1}{\sqrt{m}} (|100\dots00\rangle + |010\dots00\rangle+ \cdots + |000\dots01\rangle$$or written another way $\sum_{x \in {\mathbb Z}_2^m, |x|=1} |x\rangle$ where $|x|$ here is the hamming weight (count of the number of $1$s in $x$). And further suppose that $B$ is a code which can correctly and efficiently correct single bit errors. Then we can use the inverse trick to turn this state in to $\sum_{j=1}^m |b_i\rangle$. Similarly if $B$ is a code that can correct two bit flips errors, then we can take a superpostion over all binary strings with two $1$s and using the inverse trick convert that into $\sum_{j_1,j_2=1}^m |b_{j_1}+b_{j_2}\rangle$ which are the syndromes for all two bit errors on the code.

OK so what do we have. Suppose the $B$ is the syndrome check matrix for an $m$ bit code and it has $n$ bit syndromes. Further suppose that this code has a distance $d$ so that it can correct $t=\lfloor (d-1)/2 \rfloor$ bit flips and that we can, in polynomial time classically decode $t$ errors. Then we can take superpositions of bit strings with $t$ $1$s and $m-t$ $0$s and using the inverse trick, we can produce all superpositions over error syndromes corresponding to $t$ errors. Ignoring normalization this is $$ \sum_{x \in {\mathbb Z}_2^m, |x|=t}|x\rangle \rightarrow \sum_{j_1 \neq j_2 \neq \cdots \neq j_t} |b_{j_1}+b_{j_2}+\dots+b_{j_t}\rangle.$$

OK so now we need to prepare the states $\sum_{x \in {\mathbb Z}_2^m, |x|=t}|x\rangle$. These are so called Dicke states, which have been studied first in quantum optics, and then in a variety of other settings in quantum computing. It turns out that they are easy to prepare quantum mechanically, for a recent paper on knowledge about how to do this see “Short-Depth Circuits for Dicke State Preparation” by Andreas Bärtschi and Stephan Eidenbenz [arXiv:2207.09998] (a not so great way to do this is to use the inverse Schur transform, do we get to finally say that the Schur transform is useful for quantum algorithms now? If yes, I think John Preskill owes me a beer?)

So we’ve seen that, if we interpret $B$ as a syndrome matrix, then we can take Dicke states and cover them to superpositions over code words with a given number of errors. It’s pretty easy to also see that we can apply the phases, corresponding to the $(-1)^{v_j}$ terms for the states we want to prepare as well, these will just be $Z$ gates applied to the Dicke states where we only apply $Z$ where $v_j=1$. Finally we notice that the state we needed to prepare for $|\hat{f}^2\rangle$ was a superposition of two different numbers of errors, $0$ errors or $2$ errors. To handle this we note that when decoding the syndrome, we can also calculate the number of errors. So if we started with $ \alpha |0\rangle + \beta|2\rangle$, and then conditionally create the appropriate Dicke states in separate register,$$\alpha|0\rangle |D_0\rangle + \beta |2\rangle |D_2\rangle$$ where $|D_k\rangle$ is the appropriate Dicke state, then when we perform the uncompute step we can also uncompte that first register, and end up in the appropriate superposition of syndomes with the correct normalization (the $\alpha$ and $\beta$ we want).

Pretty cool! While we’ve just worked out the simple case of $|f^2\rangle$, we can see how this generalizes. In particular for a given $B$ it will have a distance $d$ and can correct $t=\lfloor (d-1)/2 \rfloor$ bit flips. Then we can produce the state $|f^l\rangle$ for $l \leq t$ by generalizing the above steps and assuming that we can efficiently decode up to $l$ bit errors in the code. The way we do this is to first analytically express $f^l$ as a sum over terms which are sums over sums over syndromes with $0, 1, \dots, l$ errors. This in general will give us some $w_k$ weights that we explicitly know. We then start by preparing $\sum_{k=0}^l w_k |k\rangle$. We then append an $m$ bit register and conditionally create the Dicke states with the first register number of $1$s, $$\sum_{k=0}^{l} w_k |k\rangle |D_k\rangle.$$ Then we apply the appropriate $Z$s to the bits where $v_i=1$. Finally using the decoder which take the syndrome and creates the error as well as its weight we can use the inverse trick to clear the first two registers.

The above is a general description of what the DQI algorithm does, i.e. the quantum part. But how well does it work and does it solve max-XORSAT more efficiently than the best classical algorithms? At this point I should tell you that I’ve lied. There are a couple of ways I have lied. The first is that in general instead of preparing a state with $f^l$ the DQI paper describes making an arbitrary degree $l$ polynomial. This is useful because it allows one to identify an “optimal” polynomial. The second lie is a bit bigger, in that there is another problem that is central in the DQI paper and the is the max-LINSAT problem. While max-XORSAT has clauses that are made up of mod-$2$ equation, the max-LINSAT problem generalizes this. Instead of using ${\mathbb Z}_2$ this problem uses ${\mathbb Z}_p$ where $p$ is a prime (not exponentially big in problem size, so small). One can then look at codes over ${\mathbb Z}_p$. If we let $f_j$ denote function from ${\mathbb Z}_p$ to $\pm 1$, then the function you want to maximize in the max-LINSAT problem is $$f(x)=\sum_{j=1}^m f_i( b_j \cdot x)$$ where $b_j \in {\mathbb Z}_p^n$ and the term inside the parenthesis is done with mod-$p$ arithmetic. The paper describes how to produce the $|f^l\rangle$ (and similarly for arbitrary polynomial) state. Again the main trick is to consider the decoding problem for the code with syndrome matrix made up of the $b_j$s, but now over ${\mathbb Z}_p$ instead of ${\mathbb Z}_2$.

Given these two lies have been corrected, what can we say about how well this algorithm works? If in the max-LINSAT problem one has that $r$ of the $p$ possible values for each $f_i$ map to $+1$ (and the other $p-r$ map to $-1$), then the authors are able to show a very nice asymptotic formula. Recall the $m$ is the number of clauses, $p$ is the base fields, $r$ is the just defined amount of bias in the $f_i$ towards $+1$, and $l$ is the weight to which you can efficiently decode the code for. Then the expected number of clauses satisfied when $m \rightarrow \infty$ follows a beautiful formula, which the authors call the semicircle law $$ \frac{\langle s \rangle}{m} = \left( \sqrt{\frac{l}{m} \left( 1 – \frac{r}{p} \right)} + \sqrt{\frac{r}{p} \left( 1 – \frac{l}{m} \right) } \right)^2$$

Great, so there is a good understanding of the performance of the algorithm, how does it compare to the classical world. Here one needs to define the parameters that one is looking at, and how the problems are generated. Here there are two arenas that the authors look at. The first is for the max-XORSAT case. In that case, the authors are able to show that the quantum algorithm provides a higher number of satisfied clauses than simulated annealing for random sparse instances of the problem. That’s great, simulated annealing is a top performing optimization strategy and even getting in the same ballpark for it is very hard. But in this regime it turns out that the authors were also too clever and found a classical heuristic that can outperform DQI for the same regime. Note that this isn’t the end of the story, it may be that there are regimes where the DQI does outperform all classical polynomial time algorithms. And in all fairness, the DQI result is a provable number of clauses, whereas simulated annealing is a heuristic, the authors have tried to set a high bar!

Instead of looking at random instances, one could also look at more structured cases. Here the authors show that if one uses the Reed-Solomon code then the max-LINSAT problem becomes equivalent to another problem, the Optimal Polynomial Intersection problem, and in this case DQI beats all known classical algorithms for this problem. The Optimal Polynomial Intersection problem is as follows. Suppose you are given $p-1$ subsets of ${\mathbb Z}_p$, $F_1, F_2, \dots, F_{p-1}$. Then your goal is to find a $n-1$ degree polynomial over ${\mathbb Z}_p$ that maximizes the number of intersections with these subsets, i.e. if $Q$ is the polynomial maximize the number of times $Q(1) \in F_1, Q(2) \in F_2, \dots$. This problem has been studied before in cryptographic literature and if one converts DQI for Reed-Solomon to this problem it seems that the quantum algorithm can produce more intersecting regions than the best known classical algorithm. The margin is actually quite high, for instance if the ration of $m$ over $n$ is $10$ the quantum can produce $0.7179\dots$ expected intersections, whereas the best classical produced $0.55$ expected intersections. Exciting!

Hey mom, thanks for reading this far! Everyone stand up and do the quantum algorithms dance!

But more seriously, definitely check out the paper. There are nice connections between this work and prior work, two that stand out are works over lattices from Regev, Aharonov, and Ta-Shma and then more recent work by Yamakawa, Zhandy, Tillich, and Chailloux. What I like most about the work is that it feels very different than the more traditional places where we have seen “exponential” speedups (those quotes are because we do not know how to prove much real separations, quantum or classically, for all we known factoring could be in classical polynomial time. The quotes don’t detract from what is shown here, which is the gold standard of what people these days call “provable” exponential advantage. I personally find that nomenclature off because it puts the word “provable” too close to “exponential”. YMMV if you aren’t a grumpy old dude like me). Another part I like is that it shows the power of creating a superposition over structured objects. We know, for instance, that being able to prepare the superposition over all of the permutations of a graph would lead to a polynomial time quantum algorithm for graph isomorphism. We don’t know how to do that preparation, whereas here we do know how to produce produce the superposition.

And finally I like this problem because it has a nice sort of structure for thinking of new algorithms. One can first look at other codes, and there is the nice semicircle law to guide you. Another thing one can do is think about how this generalizes to other unitaries beyond the Fourier transform, what are these optimizing for? And finally one can also try the general method of taking an objective function and performing work on this function (squaring, convolving, etc) and think if maybe that can yield useful new quantum algorithms. Lot’s to work with and I’m excited to see where this all goes (and if the classical world can catch up to the quantum for the Optimal Polynomial Intersection problem. Go quantum!)

p.s. Stephen Jordan is talking at the excellent Simons Institute Quantum Colloquium tomorrow about DQI. If you aren’t following these colloquiums, you should! They are all most excellent and they include a panel afterwards with experts who then talk about the talk. This captures the feeling of a real conference, where a lot of the best part is the follow up conversation in the hallways.

Requiem for the Living Computer Museum

This summer my fourteen year old son’s schedule for the first month off school has been no schedule at all. When I get home each day I ask him whether he has done his job, “are you bored?” I come home to find notebooks with diagrams of fencing strategy, a Google document with detailed instructions on how to optimize play in Hearts of Iron IV, and a smattering of my own books pulled from the shelves and spread on the bed. I smile because I am happy that he has achieved the goal. He has found boredom.

In 2020, when the pandemic hit, the Living Computer Museum in Seattle closed its doors. Just two years earlier its benefactor, Microsoft cofounder Paul Allen, had passed away. And this June of 2024, the estate of Paul Allen announced that the Living Computer Museum was closing doors forever and much of the museum’s collection sold at auction.

This permanent closing has made me profoundly sad.

The Living Computer Museum was multiple floors of computing history. The lowest exhibit floor felt like a traditional museum, with different exhibits of computer technology, an old Cray 1, a VR exhibit, etc, but with more interactive exhibits than one would normally associate with a museum (modular robots you could assemble to exhibit strange behaviors). This floor also included a retro arcade and a mock up of a living room circa 1980 with a console hook up to an old tv. But the top floor was the gem, and really the place I mourn.

On the top floor of the Living Computer Museum was a huge collection of computers spanning nearly the entire history of computers from mainframes to modern PCs, and these were actually plugged in and working and available to use. You could just sit down, for example, at a legendary Xerox Alto, and just start playing around with it. Here is my son, playing Karateka on an Apple IIc

Back when I was a physicist, one of the benefits of the occupation was the crazy crackpot letters or emails you would get explaining why physics as we know it was all wrong. THE UNCERTAINTY PRINCIPLE IS WRONG! They would shout at you, these retired doctors and engineers (always those professions), and it was sometimes fun to see the level of speculative craziness that one could achieve when ego dominates self reflection.

My friend Michael Nielsen once gave me one of the crackpot correspondences he had received—as the author of a blockbuster quantum computing textbook he must have received numerous such crackspondences—and one of the phrases on the postcard he gave me has always stuck with me. The phrase was “Dewey-eyed pastoralism” and the author was using it as a negative screed. The pastoralism the author was railing about was not animal husbandry, but the pastoralism of literature and art, a form in which the old country life is idolized. And the Dewey here was most certainly the pragmatic philosopher John Dewey, and used here as a negative adjective, Dewey was and is a fine enemy for the religious, his pragmatism was a real pain in the butt for wishful thinkers.

When I think back on my childhood, I always remember this phrase because I worry that I see it through “Dewey-eye pastoralism”. I grew up rural in the 1980s and was an unabashed computer geek. There was no internet, and, to the best of my knowledge there weren’t even any local BBSs in my small town. To be not just interested in, but trying to do things like build artificial life (I reversed engineered Langton’s loops from a few pictures on the back of a book), on your computer, in small town America, at that time made you incredibly strange and odd. But it was a unique experience in my life, and the more I think about it the more I remember how much I was given the opportunity to wander and explore all on my own, and what an incredible gift this has been.

I read all the computer books in the local library (maybe 15 of them in total), and when we would go an hour away to the nearest big bookstore I would stand in the computer section and try to memorize the Apple Mac computer manuals (even though I owned an Apple IIGS, some of the library structures were close enough). A major discovery was the entire backlog of Scientific American’s and the Mathematical Recreations and Computer Recreations column. Twice I got to go away to the University of Oregon to Computer Camp where they had toaster Macs, and I saw one of them catch on fire (years later I would get a chat on LinkedIn from someone inquiring if I went to that camp, it turns out he had stayed in computers and helped founded one of the major internet real estate platforms.) When I think back on this I remember how important it was that I could just sit and goof around with the computer. I remember how obsessed I was with graphics and every machine I encountered for years (including my calculator) I would write the Mandelbrot set as a test of its speed and color capabilities. As I said, it’s hard not to think back on these days with some major doses of pastoralism.

The Living Computer Museum was the closest thing I’ve found where I could relive my childhood experience. I’m glad I got to share the experience of what it was like to be a 1985 computer nerd with my family and friends. But I’m also sad, because I don’t think we have enough places where you can just sit around and goof off with these simple machines. Today’s computers are beasts, immensely capable, but that capability makes them constant distraction machines. There is something about the minimalism of those old computers that, in my head, makes me think I can more easily commune with them. And it was also a place to connect to the society as it existed before everyone wanted to be a tech nerd, money flowed from the internet pipes, and we sit through endless hype cycles of breathless FOMO. So today I mourn for my favorite museum, curse a little at Paul Allen for not setting the museum up for the future, and wonder what is going on today for which my son will look back on it with a little bit of Dewey-eyed pastoralism.

THE UNIVERSE AS A COMPUTER, John Archibald Wheeler, Part 2

In my last post I described a list created by the late physicist John Archibald Wheeler describing ways in which the universe could be like a computer. This came from a note in the collection of papers from the American Philosophical Society. In the second part of this note, Wheeler moved on to describing what the possible implications of the metaphor could be. I’ve now transcribed these and thought people would be interested in seeing this list as well.

THE UNIVERSE AS A COMPUTER

Possible Implications of the Metaphor

NOTE: The prior and companion list ”The Universe As A Computer: Possible Meanings of the Metaphor”–which should also be consulted–dealt with the metaphor in a more preliminary, general, and abstract way. Here there is an effort to list the more specific, concrete mechanisms that may permit important analogies between computers and universes-~apart from items that would be largely redundant if added to this second list (apart, in turn, from oversights that are inevitable in this unedited list, and items whose redundancy is initially uncertain). Some items here are apt to represent particular illustrative or applied examples, or sets thereof, rather than unique conceptual taxons restricted to, and precised and exhausted by, single items.

1. Physical phenomena and entities may have spontaneous or reactive output effects that occasionally, often, generally, or always elicit or modify one or more instances, modes, or kinds (contents) of direct or indirect feedbacks from near, distant, and/or all (pancosmic) entities, phenomena, events, or environments–of a like or different nature–in one or more or infinitely many cycles, phases, epochs, or progressions.

2. The former (see #1) may or may not be essential to the nature of the original physical phenomena and entities; that which is essential to the latter may be fixed substance or pattern, dynamic substance or pattern, dynamic interaction, dynamic transaction, ever-changing ever-novel modes thereof, or all such things combined.

3. Such feedback (see ## 1, 2) may be stabilizing and/or differentiating.

4. Such feedback (see ## 1, 2, 3) may be to the physical phenomena and entities, other feedback(s), and/or itself en route.

5. Such feedback (see ## 1, 2, 3, 4) may be (in addition to #3)–or have roles or elements that include–positive and negative feedback, alternation, correction, redirection, coupling, transportation, partitioning, rate control, memory, simplification, integration, comparing, measurement, intermediation and message-passing, path-finding, road maintenance, consolidation, sampling, &c.

6. Some or all parts of the physical universe may govern one another in a rotating system, that is fixed and/or random in various ways and degrees.

7. In some way the universe may resemble a representative democracy, with fixed or changing particular or general representatives, arranged in a finite or infinite hierarchy, with government descending and/or ascending in various degrees, ways, and with respect to various matters.

8. The feedback proposed above (see ## 1-5) may form various finite or  infinite–and essential or nonessential–hierarchies, networks, series, channels, matrices, equilibria, disequilibria, phenomena,  systems, logics, ‘machines’, dimensions, clusters in maps, ‘stories’, lattices, equations, number systems, mathematical groups, categories, or analogs thereto, codes, &c.

9. Nature may everywhere be using simple arithmetic~-or other extraordinarily simple or universal processes–or may be characterizable by such to a wholly unexpected degree.

10. All of history may somehow represent a memory that is intact and functioning in the present.

11. Analogously (see #10), all of the future may somehow represent a plan  that is exhaustively immanent and active in the present.

12. All that a physicist is or does may somehow be–truthfully or pragmatically–reduced to a computer or computation.

13. The difference between the analogue and digital computer may not be sharp but rather represent a continuous gradation or a definitional or ultimate fiction.

14. The universe may approximate to a computer.

15. In an ultimate sense the universe may not be a computer, but the form of physics we have now, or that will emerge in the immediate future, may in essence treat it as such.

16. Representing the universe as computer-like, in physics and other natural sciences or science in general, might be a more–or the most–efficient, fertile, clever, unified, powerful, rational, or practical way of investigating, understanding, and exploiting it.

17. Much that we think of as essential and important in the present language, form, methods, philosophy, conceptual apparatus, mathematics, and overall organization of physics–even its ideals or standards of proof, rigor, consistency, elegance, simplicity, uniquity, universality, fundamentality, necessity, repetition, conceptuality, meaning, importance, finality, objectivity, completeness, predictive power, Boolean logic, &c–may not be, or may not be essential or important to physics as it could be, or as it should and will be in the future (perhaps thanks to the computer, as that which aids or supersedes the human scientist); that which is really essential and important in or to physics may be very different than we now suppose.

18. What goes on in and constitutes Nature may be much more steplike than we now know or suppose: the density, intricacy, and breadth of steps may be far greater, the steps may be far more discrete, individualized, important, coordinated, lawful, and elegant, phenomena and Nature as a whole may be much more concatenated, sequential, plexiform, stringy or I-dimensional, anastomotic, self-entraining, modular and nodal, unidirectional, irreversible, delomorphous, Functional, irredundant, operational, machinelike, constrained, local, tiny or microphysical, quantized, systemic, produced by massive past experimentation, selection, and evolution, organismic, organic, &c, the ‘wheel of time may have far more teeth’, Nature may be far more ruled, methodic, and nomocratic, &c.

19. The universe may everywhere be filled with local operations or activities that never cease and that are all simultaneously important~-perhaps locally, perhaps universally. 

20. Modern statistical cosmology may be a necessary early approximation that will ultimately yield to a complete cosmology wherefor the universe is essentially and irreducibly an infinitely complex machine that, qua infinitely complex, transcends the very concept of machine and of a mechanistic universe. 

21. The presumption of current physics that its laws, theories, methods, and concepts actually specify, describe, explain, or are equivalent –in any final, realistic, or sufficient way–to the detailed, peculiar, and total phenomena that fill all size scales in Nature as a single, indivisible, necessary, and solid plenum, may be a fantastic oversimplification. The truth may more nearly be that each and every phenomenon and event must speak for itself–that today’s physics offers only the crudest approximation to anything.

22. All physical phenomena locally, and the universe as a whole, may basically represent a process of counting.

23. What all interactions of physical phenomena may basically represent is a process of mutual description.

24. The universe may reduce to games being played between physical phenomena–games computable by or similar to a computer.

25. The role of the genes and genome in the reproduction, ontogeny, maintenance, life, and evolution of organisms–the control by the genotype of the phenotype–is essentially computer-like: and it may be that all physical phenomena, and the universe as a whole, are somehow controlled by analogous means and therefore also computer-like or computational. (In many instances or in general, the controlling system may be purely dynamical and hence for all purposes invisible, immaterial, and transcendental!)

26. The totality of physical phenomena constituting the universe may all–everywhere and incessantly–be evolutionary and coevolutionary.

27. All physical phenomena (in partial analogy to #25)–pace modern physics–may be largely or exquisitely controlled by a single, relatively or infinitely small, singularity-like internal locus, part, structure, dynamical program, law, process, idea, and/or the like exercising autocratic or mentorial powers.

28. Nature, by analogy with what goes on in a computer, may contain what could be described as ‘blocks of information, controls, or programs  that enable phenomena to control other phenomena, or at least to alter their function or influence the world as a whole, that may move about between phenomena and live a semi-independent existence, and that man could use as tools to decipher, control, and transform particular phenomena and phenomena in general.

29. The universe may represent information, forms of information, descriptions of itself, representations, or perspectives that are all competing with one another to emerge or dominate.

30. All parts of the universe may be competing to become the whole or to absorb one another.

31. The universe may represent a grid, tape, manifold, circuit, and/or the like that is curved, finite, closed, recursive, self-generating, self-exciting, teleological, circumplexed, and/or the like with respect to space, time, energy, matter, causes, effects, information, laws, and/or the like.

32. The universe may represent patterns of change or dynamical patterns, exponential processes, equations, types of order, states, morphogenetic tendencies, and/or the like that are all simultaneously competing to annihilate one another, originate, become maximally real or ‘probable’, become omnipotent or infinite, become ubiquitous or eternal, pullulate, interfere with one another, and/or the like.

33. Like a computer, the universe may contain null or zero elements that are active and important.

34. The universe may be creating time by gradually computing the future.

35. The universe may be like a computer operating on the fabric of the present to modify it and thereby create or synthesize the future.

36. The universe may be completely stoichiometric–perfectly conserving information, change, order, motion, states, possibilities, and/or the like.

37. The universe may be ‘nomogenetic’–constantly generating, discovering, and acquiring by transformation new, novel, more powerful, and more numerous laws and rules.

38. Per contra what the laws of thermodynamics seem to imply, usable energy and net order in the universe may be perfectly conserved rather than inevitably tending to decrease with time, and the universe may be equivalent to perpetual motion.

39. The universe may contain an indefinite amount of things that are hidden, just as a computer may hide its contents.

40. The universe may represent a computer whose program is unknowingly being  written, or rewritten, by the physicists themselves or by human minds in general.

41. Mastery of the flow and possibilities of physical information on the quantum scale, or in matter at all scales simultaneously, may equip man with the highest possible form of technology or mean that he has at last triumphed over Nature; and the secret to such mastery may be to treat the universe–scientifically or technologically–as a computer.

42. It may be possible to invent, make, and exploit cellular automata that are cosmoplastic and cosmopoietic (that can automatically change the entire universe or create new universes), the natural preexistence of such automata may be unavoidable (in which case Nature herself may be entirely artificial, and physics must reinvestigate the universe from the new point of view), and/or the power of such automata to effectively transform the content of the universe by simply redefining its form or by sampling it in a certain way to an infinite degree may mean that the universe is equivalent to a myriorama (a thing that gives the appearance of being a single universe, but in fact contains all possible universes and permits their efficient substitution).

43. The main lesson from the computer for today’s physicist may be that the universe is not just a thing that sits out there–simple, objective, absolute, independent of the observer, static, the manifestation of a few universal laws–but a complex construction, and a kaleidoscope composed entirely of active processes and interactive possibilities that call for a new physics telling the physicist how to do things and how to co-operate with, and as, the universe in a new and higher form.

THE UNIVERSE AS A COMPUTER

What I love about this and also about the previous list is that it is a mixture of the mundane and slightly insane. When I think of Wheeler I often think of this contradiction, in that he was very conservative in many respects, but also completely open to very odd ideas (there is only one electron in the universe !) Or to quote his student Richard Feynman, “Some people think Wheeler’s gotten crazy in his later years; but he’s always been crazy.”

In this list the one that jumps out as me is “Mastery of the flow and possibilities of physical information on the quantum scale, or in matter at all scales simultaneously, may equip man with the highest possible form of technology or mean that he has at last triumphed over Nature; and the secret to such mastery may be to treat the universe–scientifically or technologically–as a computer.” I think in this quote, from 1980, we’ve now tracked down the origin of all of the quantum computing hype!

Another one that I find great to see is 39, “The universe may contain an indefinite amount of things that are hidden, just as a computer may hide its contents.” In many ways, this is is the content of the Kochen-Specker-Bell theorem which shows that quantum theory is contextual.

Also 7, where he suggests the universe may be a representative democracy, isn’t something I expected, but I’m hoping Ted Chiang sees this blog post and decides to write a story about this idea.

I’m also thinking I’m not the only one who had to look up the pullulate (“multiply or spread prolifically or rapidly”).

THE UNIVERSE AS A COMPUTER, John Archibald Wheeler

John Archibald Wheeler is a bit of a hero for me (and also, like all good heroes a bit of a villain). Discovering his paper “It from Bit” was definitely a huge inspiration for me to get into the field. When I found Wheeler’s paper it led me to Bill Wootters work, and I immediately charged my parents hundreds of dollars to get a copy of very paper Bill had written (I mean who wouldn’t want to read “Quantum mechanics without probability amplitudes“) Amusingly these days I think many who claim the mantle of “it from bit” have not actually read the paper, which is quite radical, you should definitely stop reading this blog and read the paper if you have not.

Because Wheeler is someone who I’ve always been interested in, I was Googling (company plug) around the other day and found out the the American Philosophical Society has a collection with papers, notes, etc from Wheeler. Among these is a typed up note that I don’t think ever made it into a paper, but which I really loved. The title of the note is “THE UNIVERSE AS A COMPUTER”, and is dated “[1980?]”. In it Wheeler first lists possible meanings for the expression “the universe as a computer” followed by possible implications for this metaphor. And by list, I don’t mean a small list, he puts down 48 possible meanings. I love it.

Here are Wheeler’s possible meanings for the metaphor:

THE UNIVERSE AS A COMPUTER |  [1980?]

Possible Meanings of the Metaphor

NOTE: Such metaphors tend to be at once useful and misleading. They are understood only intuitively, vaguely, partially, ambiguously, and abstractly by their authors and users. They are apt to be polysemous–possessed of  multiple and complex meanings, both relevant and irrelevant. Moreover, in the present instance the utilitarian and natural meaning of the very words universe and computer is unknown to an indefinite degree! Physicists who are trying to conceive a new cosmology based in part on the metaphor of the universe as a computer may find the following list clarifying, heuristic, provocative, useful for self-criticism or discussion, &c. It is a partial list that needs to be extended, systematized, edited, explained, and illuminated by examples and corollaries.

1. May be more or less similar or identical to a Turing machine or serial computer, or be wholly or partly I-dimensional.

2. May comprise the equivalent of serial computers operating, either independently or interdependently, in a parallel array.

3. May resemble more nearly a completely parallel computer whose elements are not sub-computers.

4. May represent a hybrid serial and parallel computer.

5. May represent a hierarchy, network, and/or series of many serial or parallel computers.

6. May represent an infinite or finite set of computers.

7. May represent a computer of infinite size, complexity, subdivisibilit (componentry), age, lifespan, sophistication, perfection, power, connectivity, programming, knowledge, intelligence, dimensionality, activity, strangeness, and/or the like.

8. May (or physics may) reduce to pure mathematics, numbers, or ‘order’.

9. May use or reduce entirely to information, symbols, computer-like rules, states, decisions, operations, markers, pointers, arrays, structures, programs, sets, and/or the like.

10. May be like a computer in being instructible and programmable–or re-instructible, re-programmable, controllable, and manipulable. |

11. May reduce to a computer at a higher, lower, or ultimate level (scale).

12. May be treated as equivalent to a computer if the treatment is sufficiently elaborate, universal, clever, and/or absolute.

13. May represent a set of (finitely or infinitely) homogeneous or heterogeneous computers.

14. May represent a digital or binary computer.

15. May represent an analogue computer of finite, infinite, or infinitesimal self-similarity.

16. May be mechanical in the sense of being possessed of machinelike entities or phenomena–or of parts, operations, laws, &c that are surprisingly or absolutely simple, regular, interchangeable,  interlocking, perfect, universal, reliable, deterministic, knowable, predictable, finite, utilitarian, teleological, rational, constructible, symmetric, efficient, irreducible, repetitive, and/or the like.

17. May behave like a computer on occasion or in special situations.

18. May use a (central or omnipresent) model of itself.

19. May be simulated by a computer with arbitrarily great accuracy.

20. May represent a superficial pattern projected, in effect, on a background–or something like a program or simulation of a universe that is running on an independent and truly real computer but 1s not itself real or fundamental.

21. All natural phenomena, entities, and systems (be they trees, rocks, : molecules, bacteria, men, societies, rivers, stars, diseases, clouds, or whatever) may be computers or computer-like (have  programs, perform computations, use circuitry, possess memory, use languages, process information, use Boolean logic, or the like).

22. May be clock-like or use universally or locally synchronized parts or processes. 

23. All known laws may be controlled or created by higher laws (possibly  arranged in a hierarchy or network). :

24. The universe or all physical phenomena may be reducible to computation or a single great calculation.

25. May reduce to or be controlled by, or be expressible as, pure (Boolean or non-Boolean) logic.

26. May be a mind, mind-like, or thought-like, or be reducible to or treatable by pure thought.

27. May possess a (finite or infinite) computer-like or other memory; or memories and memory processes may exist in all natural phenomena (trees, rocks, genomes, &c).

28. May wholly or in part consist of cellular automata, or be a single great or infinite cellular automaton.

29. Physical processes, phenomena, entities, events, quantities, laws, states, and relationships (such as apparent movement, mass, fundamental physical constants, particle populations, 3-space, spacetime, volcanoes, and ocean waves) may be representational, programmatic, or computational illustons~~-or be infinitely ambiguous–rather than truly fundamental or real; and per se, may be alterable or transcendable by gaining knowledge of and control over their ‘programs’ or an equivalent. Put otherwise, all phenomena, &C may appear or behave as they do because of internal programs, self-models, rules, or computation to which man can gain access.

30. May be a lattice–or spacetime may be wholly quantized.

31. All quantum numbers may reduce to a single quantum number.

32. May essentially represent but a single, individual particle, event, or computer (that somehow generates the illusion of a multifarious world); or a single iterative or recursive operation repeating itself forever or toward a finite future destiny.

33. May represent more or less homogeneous or heterogeneous iterative or recursive processes, entities, phenomena, laws, quanta, measurements, disturbances, &c.

34. It may be possible to show that information and computation are fundamentally indistinguishable and hence equivalent, or that all of the following must in a similar way be equivalent: information, computation, energy, mass, space, time, and/or the like.

35. May be an asynchronous computer (computer with asynchronous parts).

36. May be a computer that functions statistically and indeterministically.

37. May be more or less lifelike (possessed of biological features such as homeostasis, growth, self-reproduction, self-evolution, competition, purpose, or memory).

38. As science progresses and becomes more complex, difficult, mathematical, and abstract, reliance on the computer may become total; and this may make a purely computational representation of the universe expedient, and justify present-day steps in that direction.

39. No one knows how simple, powerful, universal, natural, and/or complex computers might ultimately become–and it might be in these  ultimate senses that the universe is or resembles a ‘computer’.  (It might even be possible to devise revolutionary types of computers by studying and applying computational or computer-like properties of the universe.)

40. May function in ways similar or identical to such computer or mental  processes as generalization, recognition, categorization, error  correction, time sequence retention, induction, symbolic logic, analogical reasoning, and/or the like.

41. May use holonomical memory, correlations, or laws, something like a computer’s content-addressable memory, and/or the like. |

42. May represent an anastomotic network of everywhere diverging, converging,and interacting causes and effects, and in this way resemble a computer’s structure and functioning.

43. May be in some profound sense a function of the mind, and hence half-mental; or respond in entirely different ways dependent on how it is ‘addressed’.

44. Local phenomena may be controlled from a distance or by larger systems.

45. May represent a great hierarchical network of specialized ‘administrators’ or ‘administrative! processes, functions, systems, laws, constraints, &c. Also, may contain things analogous to questions, answers, experiments, orders, requests, negotiations, conversations, messages, traffic cops, supervisors, inspectors, translators, arbitrators, pioneers, &c.

46. May be totally finitistic–or finite everywhere and in every way.

47. May be ‘centralized’ and ‘centralistically’ controlled.

48. All that exists in the universe (including relationships, entities, interactions, laws, &c that are conventionally thought of as being inert, static, or time-invariant) may in fact be time-asymmetric or an uninterrupted process of change or of cosmoplastic or cosmopoietic interadjustments and interchanges; in this ‘everywhere~ always~novel universe! work and information-processing could be omnipresent and quintessential.

THE UNIVERSE AS A COMPUTER

There are a couple of things that struck me in this list. The first is number 39. “39. No one knows how simple, powerful, universal, natural, and/or complex computers might ultimately become–and it might be in these  ultimate senses that the universe is or resembles a ‘computer’.  (It might even be possible to devise revolutionary types of computers by studying and applying computational or computer-like properties of the universe.) ” If this is truly 1980, this is pretty amazing because it is at its heart a statement that the universe’s computation may be very powerful, and that this might lead us to more powerful computers. This is right around the time Manin is credited with mentioning the idea of quantum computation in his book Computable and Uncomputable and before the 1981 Physics of Computation conference, which I think is rightfully considered a good point to take as the starting time for quantum computing.

The second thing that I love is Wheeler. “may in fact be time-asymmetric or an uninterrupted process of change or of cosmoplastic or cosmopoietic interadjustments and interchanges” I mean come on this is awesome who writes like that these days? Wheeler’s writing was so full of this crazy poetry, that it famously got parodied in “Rasputin, Science, and the Transmogrification of Destiny” by “John Archibald Wylers”. Excellent stuff.

I was lucky enough to work with a brilliant graduate student, Ben Toner, where we looked at the information cost of simulating entanglement. When we completed that work, Ben was able to attend a conference in honor of Wheeler, and I believe he told him about our work. I’m sad I never got to meet him, but I hope he could recognize in Ben talking with him, the long echo of his impact on those of us who still dream more about computation and the universe.

Acronyms Beyond NISQ

NISQ is a term coined by John Preskill1 circa 2018 and stands for “Noisy Intermediate-Scale Quantum”. The term is aimed to describe quantum computers that were not just toy few qubit systems, but systems of a slightly larger scale. This “slightly larger” is a bit hard to define, but roughly most people take it as what is achievable with a quantum computer that does not use error correction. Or in other word the “intermediate” means roughly “what you can do with the natural fidelities of your qubits” (with a fudge factor for those who want to plug their nose and use error mitigation.)

Now this is a blog, so I will give my opinion. And that is that word intermediate in NISQ drives me nuts. I mean in part because it is vague (intermediate between what?), but more because the word itself is a disaster. Intermediate comes to us from Latin, being a combination of inter, meaning “between”, and medius meaning “middle”. But this is silly how can there be a middle without being between? It’s like saying “middle middle”. Whenever I hear NISQ I am reminded of this bastard doubling, and I start working on a time machine to go back in time and work on etymological corrections (good short story idea: a society of time travelers whose sole goal is to bring reason to the etymological record).

A more interesting question than my own personal hangups on word origins is what should we call what exists on the other side of intermediate. My friend Simone Severini has used the term LISQ which stands for “Logical Intermediate-Scale Quantum”2. The idea, as I understand it, is to use this term to refer to the era where we start to construct the first error corrected quantum devices. In particular it is the place where instead of using raw physical qubits one instead uses some logical encoding to build the basic components of the quantum computer. (<high horse>Of course, all qubits are encoded, there is always a physical Hamiltonian with a much larger Hilbert space at work, what we call a qubit subsystem is a good approximation, but it is always an approximation.</high horse>). I am exciting that we are indeed seeing the ideas of quantum error correction being used, but I think this obscures that what is important is not that a qubit is use error correction, but how well it does that.

I want to propose a different acronym. Of course, I will avoid the use of that annoying term intermediate. But more importantly I think we should use a term that is more quantitative. In that vein I propose, in fine rationalist tradition, that we use the metric system! In particular the quantity that is most important for a quantum computer is really the number of quantum gates or quantum instructions that one can execute before things fall apart (due to effects like decoherence, coherent imprecision, a neutral atom falling out of its trapping potential, or a cataclysmic cosmic ray shower). Today’s best perform quantum computations have gotten signal out of their machine while reaching somewhere near the 1000 gate/instruction level3. We can convert this to a metric prefix, and we get the fine “Kilo-Instruction Scale Quantum”. Today’s era is not the NISQ era, but the KISQ era.

And as we start to move up the scale by using error correction (or somehow finding natural qubits with incredible raw fidelities) we then start to enter the regime where instead of being able to run a thousand instructions we start to be able to run a million instructions. This till be the “Mega-Instruction Scale Quantum” era or MISQ era. And I mean how cool will that be, who doesn’t love to say the word Mega (Just don’t drawl your “e” or you might stumble into politics). Then we can continue on in this vein:

  • 103 instructions = KISQ (kilo) = NISQ
  • 106 instructions = MISQ (mega)
  • 109 instructions = GISQ (giga)
  • 1012 instructions = TISQ (terra) <– Shor’s algorithm lives around here

An objection to this approach is that I’ve replaced the word intermediate with the word instruction and while we gain the remove of the “middle middle”, we now the vague term instruction. The word origin of instruction is a topic for another day, but roughly it is a combination of “in” and “to pile up”, so I would argue isn’t doesn’t have as silly an etymology as intermediate. But more to the point, an “instruction” has only an imprecise meaning for a quantum computer. Is it the number of one and two qubit gates? What about measurements and preparations? Why are we ignoring qubit count or gate speed or parallelism? How do we quantify it for architectures that use resource states? To define this is to fall down the rabbit hole of benchmarks of quantum computers4. Benchmarking is great, but it always reminds me of a saying my grandfather used to tell me “In this traitorous world, nothing is true or false, all is according to the color of the crystal through which you look”. Every benchmark is a myopia, ignoring subtleties at the cost of quantiative precision. And yes, people will fudge any definition of instruction to fit the strengths of their quantum architecture (*ahem* algorithmic qubit *ahem*). But terms like NISQ are meant to label gross eras, and I think its okay to have this ambiguity.

One thing I do like about using the metric prefix is a particularly pressing problem. While it has been a real challenge to find NISQ algorithms that have “practical” (whatever that means5) an equally pressing problem is what sort of quantum algorithms will be achievable in the MISQ era. The place where we have the most confidence in the algorithmic advantage offered by quantum computers, simulation and experimental math algorithms (like factoring), lie above the GISQ and probably in the TISQ region. Roughly what we need are quantum algorithms that are linear time algorithms, so that for instances sizes becoming non-trivial (say a thousand), their total spacetime volume is a this size squared. And while there has been work on algorithms in this era, I would not say that we confidently have algorithms we know will be practically useful in MISQ. And this MISQ/GISQ gap is extremely scary!

So long live the NISQ era! And onward and up to MISQ and beyond!

  1. “Quantum Computing in the NISQ era and beyond”, Preskill arXiv/1801.00862 ↩︎
  2. “Bye NISQ. Hello LISQ?”, Simone Severini LinkedIn post ↩︎
  3. As an example “Phase transition in Random Circuit Sampling” by the Google group (of which I’m a member) shows a signal for circuits with 32 cycles and 67 qubits. arXiv/2304.11119 ↩︎
  4. A prominent benchmark is Quantum Volume, defined in “Validating quantum computers using randomized model circuits” by Cross, Bishop, Sheldon, Nation, and Gambetta arXiv/1811.12926. This is a fine benchmark modulo that because Executives at BigCo’s apparently can be fooled by log versus linear scale, they really should have taken the log of the quantity they use to define the Quantum Volume. ↩︎
  5. My own personal opinion is that current claims of “quantum utility” are an oversell, or what we nowadays call quantum hype, but that is a subject for a beer at a quantum beer night. ↩︎

Qafblog

“We are in a box,” says me.

“Do you see some radium hooked up to a crazy steampunk device with skulls and crossbones and yellow, definitely yellow, but maybe also neon green or wavey blue?” says Qubitslets.

“No I think we put ourselves in the box,” says me. “I don’t see any radium.”

“Maybe we’re in that branch of the wave function where we’ve transmogrified ourselves into a simulation. And we’re in a box because those post capitalists are too damn cheap to simulate us outside of a small goddamn box. Just like new Seattle townhouses. We’re in the cheap Android version of a tech bros afterlife. Do you see brass?”

“No brass,” says me, “but there is something growing in the corner.”

“A tunnel?” asks Qubitslets, “Remember that tunnel we were digging to the moon? I’ve been thinking about the replica symmetry breaking structure of our many tunnel passages to the moon. Maybe we should leverage quantum effects? Jesus, did I just say that? Have I been infected with Goldman Socks ibank spin 3/2s disease? At least I’m a fermion I guess. Not like those collective Marxist bosons over at Google Vultures.”

“I will remind you of the last time we used quantum effects,” says me. “We ended up changing the vacuum state from suck to blow. Luckily it was an unstable equilibrium, and because we are particle theorists and not cosmologists, we didn’t have to think ‘equilibrium of the Higgs-Anderson-Goldstone field with respect to what?’, so the universe just relaxed back to its current vacuum. I sure do miss the werewolves from that old vacuum, though. No, the thing growing in the corner is in a jar.”

“Crap”, says Qubitslets, “we’ve been quarantined!”

Suddenly the Medium Cryostat materializes from The Void. “There is no virus and if there were viruses they would be foreign viruses with foreign RNA and a sheath of foreignness so tricky in its conformal structure we would need a wall to stop it. Because there is no virus, which there certainly isn’t, we must build walls around us and between us and under us. The Great Walling must begin. And we must stay behind our walls and only go out to visit our grandma if she is in a nursing home but we can only do that if we test grandma for foreign viruses, which don’t exist, and, of course we must stay six feet from any particle in the universe. A foreign virus has a Compton wavelength of six feet, I am told.”

“In light of us not being afraid of viruses, because they don’t exist, we will need to plan for how The Great Economy can survive the foreign viruses, if they existed. Stock buybacks were insurance claims for the future antibodies the corporations of The Great Economy would need during The Great Walling, so we should cash out their claims. Because of the uniform structure of economic strata across our Great County, we can use the base minimum wage to pay the minimum required to sustain minimal substance for those impacted by the viruses. If the viruses existed.”

“But we must also not forget what made this a Great Country, again. Never forget the resilience of our people to the scientific method, again. Of the possibility of our citizens being able to think in terms of counterfactuals, again. And of our dedication to spring break and Easter services and a Latin homily about licentious spring breakers, which no, does not arouse the Medium Cryostat. Amen. Oops, I mean Again.”

“And because we are a generous Medium Cryostat, and we know that living behind walls is hard (but necessary because maybe viruses), we shall provide to every home everywhere a jar of Sourdough Levain. Lactobacilli and yeast for all, because nothing says that you aren’t scared of invisible microscopic viruses like cooking with a self reproducing jar of sticky goo.”

“Hooray!” says Qubitslets, “we get to bake bread!”

“Did I ever tell you about the time my girl forgot to feed the starter,” says me ,”and the starter died, and in the tears that fell into the hooch that was all that remained, I could see that the relationship was over by studying the way the waves spread and reflected off the walls of the jar?”

“I bet we can study the exponential growth of the yeast and use that to model viruses,” says Qubitslets. “As physicists we know that only simple models that use physics concepts can be used for public health decisions.”

“Or maybe in the replication of the yeast, we’ll discover that we are just cellular automata or hypergraphs transforming under some crowdsourced update rule.”

“But no brass,” says Qubitslets.

“Yeah, no brass.”

** This post is a tribute to the best blog that every was, ever will be, and ever could be. Fafblog you are greatly missed.

The open access wars

Vox has just published an excellent article on open access scientific publishing, “The open access wars“. While that is too provocative of a title, the article still manages to give an accurate assessment of the current state of play. Although I like the article, I can’t help but nitpick a few points.

From the article,

“We spoke with executives at both Elsevier and Springer Nature, and they maintain their companies still provide a lot of value in ensuring the quality of academic research.”

This is false. Publishers do not add any significant value in ensuring the quality of the academic research. Peer reviewers do that, and even then not consistently. True, the publishers facilitate finding peer reviewers, but this has nothing to do with the actual publishing of the research. The role of the journal itself (sans peer review) is just to market and advertise the research, not to ensure quality. The journal also provides a venue for signaling for the prestige-smitten researcher. It is a personal value judgement how much these other things matter, but they certainly don’t impact the quality of the research.

Organizing the peer review (as opposed to actually reviewing) is the job of a middleman: it may provide a service, but it doesn’t add value to the product and it only drives up prices. This is why non-profit overlay journals like Quantum only cost between 0 and €200 to publish. The average cost per submission for hosting a preprint on the arxiv is less than $7.

Which brings me to my next point: I was also a little disappointed in the article that they failed to mention arxiv.org. They do mention the rise of preprints, but they only mention prepubmed.org, which apparently only came online in 2007. By contrast, the oldest arxiv preprint that I’m aware of is Paul Ginsparg’s notes on conformal field theory, which were posted in 1988!

That might be the oldest timestamp, but the arxiv only started having regular preprint service in 1991. Still, this means that essentially all physics research in the last 25+ years is available for free online. In practice, this means that any time you need a physics paper, you simply find the associated preprint and read that instead of the journal version. This is especially convenient for a field like quantum information, where all but a handful of papers are available on the arxiv.

Any article on open access should lead with a discussion of the arxiv. It’s one of the most important science-related developments of the last 30 years.