Should I optimize my code?


Oh, you want more details? OK. Code optimization is, broadly speaking, the practice of modifying code in a way that maintains its behavior but improves its performance. If you write code that tells you that 2+2 is 4 and takes 12 minutes to do so, an optimized version of the same code will still tell you that 2+2 is 4 but will only require 8 minutes, which is a rather good improvement. You get the idea.

The first thing to know about code performance is that it takes a lot of time to improve. A large part of optimization is undoing unwise decisions made early in the project, decisions that have since driven a lot of other decisions, and you have to ensure that changes made in the name of speed do not break your code. My own heuristic for deciding whether to optimize code has three questions. How much faster do I think I can make it run? How frequently will I run it? How long will it take me to get the maximum performance? In many cases, the time I would spend optimizing is longer than the time I would save by optimizing. Hence, I rarely optimize.
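The three questions can be turned into a back-of-the-envelope calculation. The sketch below is only an illustration: the function names are hypothetical, the 12-to-8-minute runtimes reuse the toy example above, and the 20 runs and 8 hours of effort are made-up assumptions.

```python
import math


def time_saved(runtime_min, speedup, n_runs):
    """Total minutes saved over n_runs if each run becomes `speedup` times faster."""
    return n_runs * (runtime_min - runtime_min / speedup)


def worth_optimizing(runtime_min, speedup, n_runs, effort_min):
    """True only if the time saved exceeds the time spent optimizing."""
    return time_saved(runtime_min, speedup, n_runs) > effort_min


def break_even_runs(runtime_min, speedup, effort_min):
    """Number of runs needed before the optimization effort pays for itself."""
    saved_per_run = runtime_min - runtime_min / speedup
    return math.ceil(effort_min / saved_per_run)


# Assumed numbers: 12 minutes per run, 1.5x faster (12 -> 8 minutes),
# 20 runs planned, and a full working day (480 minutes) of optimization.
print(time_saved(12, 1.5, 20))
print(worth_optimizing(12, 1.5, 20, 480))
print(break_even_runs(12, 1.5, 480))
```

Under these assumptions, 20 runs save 80 minutes against 480 minutes of effort, and the code would need to run 120 times before the work pays off.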

But in most cases, code speed is the wrong question. If I were to look at myself when I’m “programming”, I would say that about 80% of my time is spent thinking about stuff and writing it down on paper, 10% reading the documentation, 8% actually typing and debugging, and about 2% running things (this doesn’t apply at all to very intensive computing projects, but these are a special case, and if you do this sort of thing you know what you’re doing).
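To put rough numbers on this, the fraction of a whole project saved by speeding up one activity is that activity's share of the time multiplied by (1 - 1/speedup). The sketch below uses the shares from the breakdown above; the function name and the speedup factors are illustrative assumptions.

```python
def project_time_saved(share, speedup):
    """Fraction of total project time saved when an activity that takes
    `share` of the project becomes `speedup` times faster."""
    return share * (1 - 1 / speedup)


# Shares taken from the breakdown above.
running = 0.02
reading_docs = 0.10

print(project_time_saved(running, 1.5))           # code runs 1.5x faster
print(project_time_saved(running, float("inf")))  # upper bound: code runs instantly
print(project_time_saved(reading_docs, 2))        # documentation twice as quick to read
```

With these shares, a 1.5x faster program saves well under 1% of the project, and even an instantaneous one saves at most 2%, whereas documentation that is twice as quick to read saves 5%.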

Taking a step back, “optimization”, at the scale of a project, would focus on the most time-consuming task. Good documentation, a clear API, informative error messages, and sensible keywords are things that will make the entire project faster. Computer time is non-blocking; I have no issue with letting a task run while I go home, and checking the results in the morning. Our own time is much more valuable; if I need to spend an hour reading the documentation for something I can express in 10 minutes, then an obvious target for optimization is to write better documentation.

From a developer's point of view, everything is a trade-off. Time spent on making the code faster is time that is not spent on making the test suite complete, the documentation effective, the use cases compelling, or the API reasonable. And so I think it is useful to decompose a project into three steps: make it run (which is a matter of hours), make it right (on a timescale of days), and make it fast (on a timescale of weeks or months, with a lot of pain and curse words).

There are cases when optimization is necessary. If you use shared computing resources, yes, you owe it to other users to extract the last drop of performance (not only raw speed, but also memory footprint and I/O) from your code. But for the majority of cases in ecological research, optimization is the wrong target when it focuses only on the code. In brief, code I can trust is more useful to me than code that is fast. Good documentation and structure decrease the time spent on a project, probably more effectively than a 10% gain in performance would.