What is computational ecology?

What is computational ecology? I am working on a manuscript to discuss this topic in the context of ecological synthesis. Since it is almost ready, I would love to get some feedback. And so I have pasted below a part of the introduction. If this little experiment goes well, I will add another section next week, before making it public as a preprint. Think of this as a trailer. To an academic article. Because these, I guess, are the times we live in.

Computational science happens when algorithms, software, data management practices, and advanced research computing are brought together with the explicit goal of solving “complex” problems. Typically, problems are considered complex when they cannot be solved appropriately with modelling or data collection alone. Computational science is one of the ways to practice computational thinking (Papert 1996), i.e. the feedback loop of abstracting a problem to its core mechanisms, expressing a solution in a way that can be automated, and using interactions between simulations and data to refine the original problem or suggest new knowledge. Computational approaches are commonplace in most areas of biology, to the point where one would almost be confident that they represent a viable career path (Bourne 2011). Data in ecological studies are usually highly variable, and are time-consuming, costly, and demanding to collect. In parallel, many problems lack appropriate formal mathematical formulations. For these reasons, computational approaches hold great possibilities, notably to further ecological synthesis and help decision-making (Petrovskii & Petrovskaya 2012).

Levin (2012) suggested that ecology (and evolutionary biology) should continue their move towards a marriage of theory and data. In addition to the aforementioned lack of adequately expressed models, this effort is hampered by the fact that data and models are often developed independently, and reconciling the two can be difficult. This has been suggested as one of the reasons why theoretical papers (defined as papers with at least one equation in the main text) experience a sharp decrease in citations (Fawcett & Higginson 2012); this is a tragic sign that empirical scientists do not see the value of theoretical work, which of course can be blamed on both parties. One of the leading textbooks on mathematical models in ecology and evolution (Otto & Day 2007) focuses more on algebra and calculus than on the integration of models with data. Other manuals that do cover the integration of models and data tend to lean more towards statistical models (Bolker 2008; Soetaert & Herman 2008). This paints a picture of ecology as a field in which dynamical models and empirical data do not interact much, and in which the literature instead develops in silos.

Ecology as a whole (and community ecology in particular) circumvented the problem of model and data mismatch by investing in the development and refinement of statistical models (see Warton et al. 2014 for an excellent overview) and “numerical” approaches (Legendre & Legendre 1998) based on multivariate statistics. These models are able to explain data, but they very rarely give new predictions. This is, essentially, the niche that computational ecology can fill; at the cost of a higher degree of abstraction, its integration of data and generative models (i.e. models that, given rules, will generate new data) can help initiate the investigation of questions that have not received extensive empirical treatment, or for which the usual statistical approaches fall short.
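To make "generative model" a little more concrete, here is a minimal sketch in Python. It is not taken from the manuscript: the choice of a stochastic Ricker model, the function name `simulate_ricker`, and every parameter value are assumptions made purely for illustration. The point is only that a rule (density-dependent growth) plus a source of stochasticity is enough to generate new data.

```python
import math
import random


def simulate_ricker(n0, r, k, steps, noise_sd, seed=None):
    """Generate a population time series from a stochastic Ricker model.

    Hypothetical illustration: n0 is the initial density, r the growth
    rate, k the carrying capacity, noise_sd the standard deviation of
    the environmental noise on the log scale.
    """
    rng = random.Random(seed)
    sizes = [n0]
    for _ in range(steps):
        n = sizes[-1]
        # The rule: density-dependent growth towards k.
        expected = n * math.exp(r * (1.0 - n / k))
        # The stochastic part: multiplicative log-normal noise.
        sizes.append(expected * math.exp(rng.gauss(0.0, noise_sd)))
    return sizes


# One simulated trajectory of 50 time steps (51 values with the start).
trajectory = simulate_ricker(n0=10.0, r=0.5, k=100.0, steps=50,
                             noise_sd=0.1, seed=42)
```

Running the function again with a different seed gives a different, equally plausible trajectory, which is what separates this kind of model from a statistical description of a single dataset.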

What is computational ecology? It is the application of computational thinking to ecological problems. This defines three core characteristics of computational ecology. First, it recognizes ecological systems as complex and adaptive; this places a great emphasis on mathematical tools that can handle, or even require, a certain degree of stochasticity (Zhang 2010, 2012). Second, it understands that data are the final arbiter of any simulation or model (Petrovskii & Petrovskaya 2012); this favours the use of data-driven approaches and analyses (Beaumont 2010). Finally, it accepts that some ecological systems are too complex to be formulated in mathematical or programmatic terms (Pascual 2005); the use of conceptual, or “toy”, models, as long as they can be confronted with empirical data, is preferable to “abusing” mathematics by describing the wrong mechanism well (May 2004).
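As a rough illustration of what "confronting a toy model with data" can look like, here is a minimal sketch of rejection sampling in the style of approximate Bayesian computation. The text above does not prescribe this method; it is one common data-driven approach, swapped in here as an example. The model, the uniform prior on the carrying capacity `k`, the summary statistic, the tolerance, and the stand-in "observed" series (which is itself simulated, not real data) are all assumptions.

```python
import math
import random
import statistics


def simulate(r, k, steps, noise_sd, rng, n0=10.0):
    """Toy stochastic Ricker model (hypothetical example model)."""
    n = n0
    series = [n]
    for _ in range(steps):
        n = n * math.exp(r * (1.0 - n / k) + rng.gauss(0.0, noise_sd))
        series.append(n)
    return series


def summary(series):
    """Mean density over the second half of the series."""
    return statistics.fmean(series[len(series) // 2:])


def abc_rejection(observed, n_draws, tolerance, seed=0):
    """Keep the values of k whose simulations resemble the observations."""
    rng = random.Random(seed)
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        k = rng.uniform(20.0, 200.0)  # assumed prior on k
        simulated = simulate(r=0.5, k=k, steps=len(observed) - 1,
                             noise_sd=0.1, rng=rng)
        # The data decide: discard k if the simulation is too far off.
        if abs(summary(simulated) - target) <= tolerance:
            accepted.append(k)
    return accepted


# Stand-in for empirical data: a series simulated with a known k = 100.
observed = simulate(r=0.5, k=100.0, steps=60, noise_sd=0.1,
                    rng=random.Random(1))
posterior_k = abc_rejection(observed, n_draws=2000, tolerance=5.0)
```

The accepted values in `posterior_k` approximate which carrying capacities are compatible with the observations, without ever writing down a likelihood. A real analysis would need a more careful choice of summary statistics and tolerance than this sketch makes.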