It’s been over six months since I made the jump from quantum computing theory professor at the University of Washington to software engineer at Google. When I hear from my friends back in quantum computing, the second question they ask is, “What’s it like?” (The first question is whether Google wants to build a quantum computer.) There are lots of answers to this question, but what I think is really interesting is not how I feel about the ins and outs of this new career, but what I found most surprising about the similarities between my new job and my old job. And, for me, hands down, the most surprising similarity is between reading papers and reading other people’s code.
Reading a paper in a new subject area was definitely one of my favorite parts of my old job (and a central skill in being a good researcher). You’d start out, often, with only a small clue about the subject of the paper. Often you were led to the paper by a search that hit a few keywords and an abstract that seemed interesting. But after a few pages it often became clear that there were all sorts of terms and ideas that you just hadn’t seen before. And so you’d have to spend some time reading other papers that contained the terms you didn’t understand, to see if they helped. Mostly they didn’t, but sometimes they did, and then you could backtrack and figure out some of what the first paper said. This sort of jumping around, at least for me, occurred quite a bit as I’d try to parse a paper, interspersed with periods of logical thought and pen-and-paper verification of calculations.
Reading other people’s code is very similar. At first you start looking at some class, say, and you have some vague idea what it does. Documentation, and implicit documentation through naming, gives you some idea of what is going on, but quickly you see the code start calling other code whose workings and exact purpose you don’t know, and so you have to go track down that other class, figure it out, and then backtrack. Of course this is often interspersed with bits of following the logic of the code. Today, with modern IDE tools, this sort of jumping back and forth becomes a quick habit and makes the process of figuring out someone else’s code significantly easier.
Which got me thinking. Why aren’t there modern tools for reading research documents that provide some of the functionality found in IDEs such as Eclipse? Certainly some authors are gracious enough to compile their LaTeX such that their citation data is a link, but this is a long way from having PDFs where you can click on a citation in the text and get immediately transported to the other paper, maybe even to the particular location in that paper that is relevant. I think the technical challenge here is providing hooks between the documents: how do I make a citation that is more than just a citation of the full paper? (Wouldn’t it be nice if you could specify the set of ranges of relevant lines in the paper?) There are certainly very cool tools out there now for storing and parsing your scientific papers, but while the implicit linking between these papers is complex, most of this complexity is buried in the [12] citation pattern. But maybe solutions for this are already out there? Thoughts?
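To make the "ranges of relevant lines" idea a bit more concrete, here is a minimal sketch of what such a citation record might look like. Everything in it is hypothetical: the names `LineRange` and `RangedCitation`, the `paper_id` field, and the assumption that a paper can be addressed as a plain list of lines are all made up for illustration and do not correspond to any existing tool or citation format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LineRange:
    """A 1-indexed, inclusive span of lines in the cited paper (hypothetical)."""
    start: int
    end: int

    def __post_init__(self) -> None:
        if self.start < 1 or self.end < self.start:
            raise ValueError(f"invalid line range: {self.start}-{self.end}")


@dataclass
class RangedCitation:
    """A citation that points at specific lines, not just the whole paper."""
    paper_id: str  # assumed: some stable identifier for the cited paper
    ranges: List[LineRange] = field(default_factory=list)

    def resolve(self, paper_lines: List[str]) -> List[str]:
        """Return the cited lines of the target paper, in citation order."""
        cited: List[str] = []
        for r in self.ranges:
            # Convert the 1-indexed inclusive range to a Python slice.
            cited.extend(paper_lines[r.start - 1 : r.end])
        return cited


# Usage: cite lines 2-3 and line 5 of a (toy) five-line paper.
paper = ["intro", "definition", "lemma", "aside", "theorem"]
citation = RangedCitation("hypothetical-paper-id", [LineRange(2, 3), LineRange(5, 5)])
relevant = citation.resolve(paper)
```

A reader tool could then jump straight to `relevant` instead of the top of the cited paper, much as an IDE jumps to a method definition instead of the top of the file. The hard part, of course, is agreeing on stable line or location identifiers across versions of a paper, which this sketch simply assumes away.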