Anthony Scopatz

I think, therefore I amino acid.

Passive Reproducibility: It’s Not You, It’s Me

This was originally published at inSCIght.

The ICERM workshop on Reproducibility in Computational and Experimental Mathematics at Brown University is coming up in a couple of weeks. Prior to this, they invited all participants to submit a short position paper “…to express [our] thoughts on issues concerning reproducibility…” I couldn’t pass up the opportunity. I hope you enjoy my submission (below).

Dear User,

Ugggh, I don’t know how to tell you this so I am just going to come out and say it. It’s over. There, bam, I’m sorry. I would have liked to do this face-to-face or over the phone (but I am a computer). So it goes.

I just need some time to work on myself. You obviously were not paying enough attention to me. You were always writing some paper or proceeding. “Publish or perish,” I know. Well, you’d think that occasionally you would be able to spend some time writing me. I am a publication too! You write me, you copyright me, and you put me out there in the world for others to read. You would think that counts for something. I have heard your argument over and over and over that I can’t get you a shiny faculty gig. I think that is ridiculous. Thousands of lines of code, a Ph.D., and a post-doc later, and now the review committee won’t give me the time since epoch?!

Also, there is this unnatural obsession you have with the novelty of your work. I understand that you want to do something new, something that distinguishes you. You are smart. You have had some wins. But look, science, technology, engineering, and mathematics are cumulative pursuits. These things get built up over time. Also, you are human. You live and work in a community of people very similar to you. Most of them share this ridiculous predilection with you. I’ll let you do the math about a group of people with effectively the same training all trying to do something differentially new at the same time.

You wanna do something really novel? Run the same simulation twice and get the same answer! Bonus points for recording and saving exactly what you did to get that answer.

OK, I am not saying that it isn’t a noble pursuit to try to do something unique. But it is not the only worthwhile pursuit. I have needs too! I feel like the quality of my code base has decreased in the years that I have stood by you. This will come back to bite you. P = 1. I do not want to be around when that happens.

Listen, I know that I am not perfect. I am sorry. Total reproducibility is hard. It requires a lot of active, concentrated effort on your part. Without anyone forcing it down your throat, and with all that you are juggling, it is easy to understand why you dropped this whole long-term reproducibility thing. In my humble opinion, it is a failure of the education system that reproducibility wasn’t beaten into you to the point of self-regulation. C’est la vie, nothing we can do about that now.

But it is not as if there are no tools out there whatsoever. I know how much you love labeling directories ‘v01’, ‘v02’, ‘v03’, and all. But honey, there are version control systems out there. There are also these things called test suites that I think you would really like if you gave them a shot. (Plus you can sell them to your PI as part of your V&V/UQ effort.) Oh, and since you like writing so much, you should really look into documentation tools. Using them is like writing a paper, only even more technical (and prettier).

I know you have your workflow. But it is so manual that my friends mistake you for a ditch digger! And we never go out anymore because YOU spend all of your time flirting with that tease BASH. (“Most religious terminal,” yeah, right.)

Sorry, sorry, I got worked up there. Maybe I should just be invisible. You are right that even if you adopted all of the tools I just mentioned, you would be only halfway to real simulation reproducibility. Easy-to-use, transparent environment- and workflow-capturing tools just are not widely available. And even if they were, it probably wouldn’t be a walk in the park making such tools work with that FORTRAN 2 code everyone in your research group seems to think is so magical.

I knew you were like this when we started running together. And to be fair, you haven’t changed all that much. So it is not really you, it’s me. But for my own sake I am going to have to find someone who respects me for who I am, maybe a software developer. I hope that we can still be friends…

Love Always, Computational Science
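
P.S. Since I brought it up, here is roughly what “run the same simulation twice, get the same answer, and record what you did” could look like. It is only a minimal sketch in Python, and everything in it is made up for illustration: `simulate` is a toy random walk standing in for your real code, and `run_and_record` and `save_record` are hypothetical helpers, not anyone’s actual tooling.

```python
import hashlib
import json
import platform
import random


def simulate(seed, n_steps=1000):
    """Toy stand-in for a real simulation: a seeded 1-D random walk."""
    rng = random.Random(seed)  # private generator, so no hidden global state
    position = 0.0
    for _ in range(n_steps):
        position += rng.uniform(-1.0, 1.0)
    return position


def run_and_record(seed, n_steps=1000):
    """Run the simulation and return a record of exactly what was done."""
    result = simulate(seed, n_steps)
    return {
        "seed": seed,
        "n_steps": n_steps,
        "python_version": platform.python_version(),
        "result": result,
        # A digest makes it easy to compare runs without eyeballing floats.
        "result_sha256": hashlib.sha256(repr(result).encode()).hexdigest(),
    }


def save_record(record, path):
    """Write the run record as JSON (hypothetical helper; the path is up to you)."""
    with open(path, "w") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)


# Same inputs, same environment: the two records should be identical.
first = run_and_record(seed=42)
second = run_and_record(seed=42)
print("same answer twice:", first == second)

# The record serializes cleanly, so it can be stored next to the output.
record_json = json.dumps(first, indent=2, sort_keys=True)
```

Calling `save_record(first, "run_record.json")` (the file name is just an example) would leave a small JSON file next to your output. A seed and an interpreter version are nowhere near a full environment capture, but they beat a directory named ‘v03’.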