Run it all again

The very first test is that your code must run, beginning to end, top to bottom, without error, and ideally without any user intervention. Running it should, in principle, (re)create all figures, tables, and numbers you include in your paper.

TL;DR

This is the most basic test of reproducibility: if you cannot run your code, you cannot reproduce your results, and neither can anybody else. So just re-run the code.
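The "just re-run it" test is easiest when a single entry point runs every step in order and stops at the first error. A minimal sketch in Python, where the step functions are hypothetical stand-ins for your actual data-preparation, analysis, and figure scripts:

```python
# Hypothetical controller script: one entry point that runs the whole
# pipeline top to bottom. Any uncaught exception stops the run, so
# reaching the final message means every step completed.

def prepare_data():
    print("preparing data")       # stand-in for your data step

def run_analysis():
    print("running analysis")     # stand-in for your estimation step

def make_figures():
    print("making figures")       # stand-in for your figures/tables step

def main():
    for step in (prepare_data, run_analysis, make_figures):
        step()
    print("all steps completed")

if __name__ == "__main__":
    main()
```

The same idea works in any language: a single `run_all` script (a shell script, a Stata do-file calling other do-files, an R master script) that runs everything in sequence with no manual steps in between.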

Exceptions

Code runs for a very long time

What happens when some of these re-runs take a very long time? See later in this chapter for how to handle this.

Making the code run takes you a very long time

Even if the code, once started, runs on its own, you might need to spend a lot of time getting all the various pieces to run in the first place. Treat this as a warning sign: if it takes you a long time to get the code running, or to manually reproduce the results, it will likely take others even longer. It may also suggest that you have not re-run your own code very often, which is correlated with fragility or even outright irreproducibility. We address this partially in the next section [1].

Takeaways

  • Your code runs without problems, after all the debugging.

  • Your code runs without manual intervention, and with low effort.

  • It actually produces all the outputs.

  • Your code generates a log file that you can inspect, and that you could share with others.

  • It will run on somebody else's computer.
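The log-file takeaway above can be satisfied by duplicating everything the run prints into a file (the equivalent of `log using` in Stata, or piping through `tee` in a shell). A minimal sketch in Python; the `Tee` class and the `run.log` location are illustrative, not a standard API:

```python
import sys
from pathlib import Path

class Tee:
    """Write everything to the console and to a log file at once,
    so each run leaves an inspectable, shareable record."""
    def __init__(self, path):
        self.log = open(path, "w")
        self.stdout = sys.stdout

    def write(self, text):
        self.stdout.write(text)
        self.log.write(text)

    def flush(self):
        self.stdout.flush()
        self.log.flush()

log_path = Path("run.log")   # hypothetical log location
tee = Tee(log_path)
sys.stdout = tee             # from here on, prints go to both places

print("step 1: data prepared")
print("step 2: analysis done")

sys.stdout = tee.stdout      # restore the console
tee.log.close()              # flush and finalize the log file
```

After the run, `run.log` is the evidence that the code actually ran, and something you can attach to a replication package or share when debugging.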

Why is this not enough?

  • Does your code run without manual intervention?

    • Automation matters for robustness checks as well as for efficiency.

  • Can you provide evidence that you ran it?

    • A log file gives you something to inspect and to share with others. It also helps in debugging, for you and for others.

  • Will it run on somebody else’s computer?

    • Running it again on your own computer does not guarantee this:

      • because it does not guarantee that somebody else has all the software (including packages!)

      • because it does not guarantee that all the directories for input or output are there

      • because many intermediate files might be present that are not in the replication package

      • because you might have run things out of sequence, or relied on previously generated files in ways that won’t work for others

      • because some outputs might be left over from earlier test runs, yet fail to be (re)created in this run
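Several of these failure modes, missing directories and stray files that are not part of the replication package, can be caught with a pre-flight check run before the pipeline. A sketch, assuming a hypothetical directory layout (`data/raw`, `output/figures`, `output/tables`) and a hypothetical manifest of packaged files:

```python
from pathlib import Path

# Hypothetical layout and manifest; substitute your package's own.
REQUIRED_DIRS = ["data/raw", "output/figures", "output/tables"]
PACKAGED_FILES = {"data/raw/survey.csv"}

def preflight(root="."):
    """Return a list of problems that would break a fresh run of the
    replication package on somebody else's computer."""
    root = Path(root)
    problems = []

    # Directories the code expects for input or output must exist.
    for d in REQUIRED_DIRS:
        if not (root / d).is_dir():
            problems.append(f"missing directory: {d}")

    # Files under data/ that the package does not ship are suspect:
    # a fresh copy on another machine will not have them.
    for f in (root / "data").rglob("*"):
        if f.is_file() and f.relative_to(root).as_posix() not in PACKAGED_FILES:
            problems.append(f"unpackaged file: {f}")

    return problems
```

Running `preflight()` against a fresh copy of the package, rather than your working directory, comes closest to simulating what a stranger's machine will see.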