Should Papers Have Unit Tests?

Perhaps the greatest shock I’ve had in moving from the hallowed halls of academia to the workman depths of everyday software development is the amount of testing that is done when writing code. Likely I’ve written more test code than non-test code over the last three plus years at Google. The most common type of test I write is a “unit test”, in which a small portion of code is tested for correctness (hey Class, do you do what you say?). The second most common type is an “integration test”, which attempts to test that the units working together are functioning properly (hey Server, do you really do what you say?). Testing has many benefits: correctness of code, of course, but it is also important for ease of changing code (refactoring), supporting decoupled and simplified design (untestable code is often a sign that your units are too complicated, or that your units are too tightly coupled), and more.
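As a minimal illustration of the “hey Class, do you do what you say?” kind of test, here is a sketch using Python's built-in unittest module. The RunningMean class and the run_tests helper are hypothetical names invented for this example; they are not taken from any real codebase.

```python
import unittest


class RunningMean:
    """Hypothetical class: tracks the mean of the values added so far."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def mean(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


class RunningMeanTest(unittest.TestCase):
    """Unit tests: each one checks a single promise made by the class."""

    def test_mean_of_known_values(self):
        rm = RunningMean()
        for value in (1.0, 2.0, 3.0):
            rm.add(value)
        self.assertAlmostEqual(rm.mean(), 2.0)

    def test_mean_of_empty_raises(self):
        with self.assertRaises(ValueError):
            RunningMean().mean()


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RunningMeanTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

An integration test would look similar in form, but would exercise several such units wired together rather than one class in isolation.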

Over the holiday break, I’ve been working on a paper (old habit, I know) with lots of details that I’d like to make sure I get correct. Throughout the entire paper writing process, one spends a lot of time checking and rechecking the correctness of the arguments. And so the thought came to my mind while writing this paper, “boy it sure would be easier to write this paper if I could write tests to verify my arguments.”

In a larger sense, all papers are a series of tests: small arguments convincing the reader of the veracity or likelihood of the given argument. Testing in a programming environment has one vital distinction, though: the tests are automated, with the added benefit that you can run them often as you change code and gain confidence that the contracts enforced by the tests have not been broken. Even without automation, there might be a benefit to writing a separate section of “unit tests” for different portions of a main argument in a paper. Such unit test sections could be small and self-contained, and could serve as supplemental reading that helps a reader gain confidence in the claims of the main text.

I think some of the benefits of having a section of “unit tests” in a paper would be:

  • Documenting limit tests: A common trick of the trade in physics papers is to take a parameter to a limiting value to see how the equations behave. Often one can recover known results in such limits, or show that certain relations hold after you scale the parameters. These types of arguments give you confidence in a result, but are often left out of papers. This is akin to edge case testing by programmers.
  • Small examples: When a paper gets abstract, one often spends a lot of time trying to ground oneself by working with small examples (unless you are Grothendieck, of course). Often one writes a paper by interjecting these examples into the main flow of the paper, but they fit more naturally in a unit testing section.
  • Alternative explanation testing: When you read an experimental physics paper, you often wonder, am I really supposed to believe the effect that they are talking about? Often large portions of the paper are devoted to trying to settle such arguments, but when you listen to experimentalists grill each other you find that there is an even further depth to these arguments. “Did you consider that your laser is actually exciting X, and all you’re seeing is Y?” The amount of this that goes on is huge, and sadly, not documented for the greater community.
  • Combinatorial or property checks: Often one finds oneself checking that a result works by doing something like counting instances to check that they sum to a total, or checking that a property holds before and after a transformation (an invariant). While these are useful for providing evidence that an argument is correct, they can often feel a bit out of place in a main argument.
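To make two of these concrete, here is a minimal sketch in Python of what an executable limit test and a counting check might look like. The examples are assumptions chosen for illustration and do not come from any particular paper: the limit test checks that relativistic kinetic energy reduces to the Newtonian form at small velocities, and the counting check confirms by brute force that the subsets of an n-element set, grouped by size, sum to 2^n. All function names are hypothetical.

```python
import math
from itertools import combinations


def kinetic_energy_ratio(beta):
    """Ratio of relativistic to Newtonian kinetic energy at speed v = beta * c.

    Relativistic: (gamma - 1) m c^2. Newtonian: (1/2) m v^2.
    The mass and the speed of light cancel in the ratio.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma - 1.0) / (0.5 * beta**2)


def check_newtonian_limit():
    """Limit test: the ratio should approach 1 as beta goes to 0."""
    for beta in (0.1, 0.01, 0.001):
        deviation = abs(kinetic_energy_ratio(beta) - 1.0)
        # The leading correction is (3/4) beta^2, so beta^2 is a safe bound.
        assert deviation < beta**2, (beta, deviation)
    return True


def subset_counts(n):
    """Count the subsets of an n-element set by brute force, grouped by size."""
    items = range(n)
    return [sum(1 for _ in combinations(items, k)) for k in range(n + 1)]


def check_subset_counting(n=6):
    """Counting check: enumeration matches the formula and sums to 2^n."""
    counts = subset_counts(n)
    assert counts == [math.comb(n, k) for k in range(n + 1)]
    assert sum(counts) == 2**n
    return True


if __name__ == "__main__":
    check_newtonian_limit()
    check_subset_counting()
    print("all checks passed")
```

Checks like these only cover claims that reduce to a computation; the alternative explanation tests above have no obvious executable form.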

Of course it would be wonderful if there were a way that these little “units” could be automatically executed. But the best path I can think of right now towards getting to that starts with the construction of an artificial mind. (Yeah, I think perhaps I’ve been at Google too long.)

This entry was posted in Off The Deep End, Programming.

5 Responses to Should Papers Have Unit Tests?

  1. sflammia says:

    Do grad students count as artificial minds? That’s a good way to do unit testing.


    • Pontifex Praeteritorum says:

      If I had to choose, I would say grad students’ minds are less artificial than their bosses’.


  2. John Sidles says:

    In regard to the automated validation-generation-verification of computer codes, back in the early 2000s the NSF and DARPA/ONR envisioned that valid, efficient C++ dynamical simulation code could be generated automatically, via type-specifications compiled in Haskell (these were contracts NSF/N00014-05-1-0700 and DARPA-ONR/DMR-9400334, managed by Dennis Healy).

    An early fruit of this effort was the MIT Electromagnetic Equation Propagation (MEEP) simulation framework. Yet only MEEP 1.0 was Haskell-generated … subsequent generations of MEEP have been hand-coded.

    It’s natural to wonder what circumstances led to the abandonment of the formal validation-generation-verification methods that were originally envisioned by the Haskell/MEEP program. Natural, that is, in the sense that we are contemplating applying similar methods in quantum simulation.

    Needless to say, both the motivations and the challenges of the Haskell/MEEP project arise even more prominently in quantum simulation than in classical simulation, for mathematical reasons that were posted in the comments to the Ars Mathematica essay “What did Grothendieck do?”.

    If any readers of The Quantum Pontiffs are familiar with the circumstances that led to the abandoning of Haskell code-generation methods by the MEEP project, or have personal experience of any comparable validation-generation-verification project, then comments in regard to lessons-learned would be very welcome.


  3. aram says:

    For math papers, part of the value lies in the new ideas they contribute. Arguably this may be more important than strict correctness, since mistakes can be repaired more often than uninteresting ideas. Maybe this partially accounts for the lack of interest, since we don’t know how to write proofs in a verifiable way without removing the (human) intuition behind them.


    • Pontifex Praeteritorum says:

      “since mistakes can be repaired more often than uninteresting ideas.”

      I see what you did there.

