What is the most uniquely biodiverse place in Canada?

Well, this question certainly needs to be qualified. In this post I will explore coupling species distribution models with the @LegeDeC13 approach to measuring site-level contributions to β-diversity. In case you are curious, the entire code (100% Julia) is available on GitLab. Specifically, I am interested in finding out where the waterfowl communities are the most distinctive.

The LCBD (Local Contribution to β-Diversity) belongs to a family of approaches that estimate β-diversity through the variance of a community data matrix $\textbf{Y}$, and it requires only a small number of transformations. It is therefore really fast to calculate, so I wanted to explore how usable it would be for highlighting areas of high uniqueness. The usual application of this method is to compare discrete sites, or sampling units, so the community matrix is often relatively small. But from a communication point of view, it is useful to show, on a map, which parts of a territory are exceptionally important.
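For reference, here is the definition as I understand it from @LegeDeC13, written for a (possibly transformed) matrix $\textbf{Y}$ with $n$ sites as rows and $p$ species as columns. The symbols $s_{ij}$, $\bar{y}_j$, $SS_\text{Total}$ and $BD_\text{Total}$ are notation introduced for this summary and do not appear elsewhere in this post.

$$
s_{ij} = \left(y_{ij} - \bar{y}_j\right)^2, \qquad
SS_\text{Total} = \sum_{i=1}^{n}\sum_{j=1}^{p} s_{ij}, \qquad
BD_\text{Total} = \frac{SS_\text{Total}}{n-1}, \qquad
LCBD_i = \frac{\sum_{j=1}^{p} s_{ij}}{SS_\text{Total}}
$$

Each $LCBD_i$ is the share of the total sum of squares that comes from site $i$, so the values sum to 1 across sites.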

The approach I used is as follows.

First, I got a list of Canadian waterfowl from GBIF (I focused on the species that were observed at least once in the last 600 waterfowl observations, which introduces obvious biases, but is good enough for a proof of concept). This resulted in a list of 35 unique species.

For each of these species, I retrieved up to the latest 800 issue-free occurrences. The temporal coverage therefore differs between species, depending on how rare each one is or how attractive it is to birders, but this is still good enough for proof-of-concept code. Because of the rate limits of the GBIF API, this took about one (1) lunchtime.
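To give an idea of what the retrieval step involves, here is a minimal Python sketch that only builds the paginated query URLs for the public GBIF occurrence search endpoint. It is not the Julia code used for this post. The function name `occurrence_urls`, the placeholder taxon key, and the choice of filters (`hasCoordinate`, `hasGeospatialIssue`) are assumptions of the sketch; in particular, these filters only approximate "issue-free", and the sketch does nothing to guarantee that the most recent records come first.

```python
from urllib.parse import urlencode

GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"
PAGE_SIZE = 300  # the occurrence search returns at most 300 records per request


def occurrence_urls(taxon_key, wanted=800, country="CA"):
    """Build one query URL per page needed to collect `wanted` records."""
    urls = []
    for offset in range(0, wanted, PAGE_SIZE):
        query = {
            "taxonKey": taxon_key,
            "country": country,
            "hasCoordinate": "true",        # assumption: georeferenced records only
            "hasGeospatialIssue": "false",  # assumption: rough stand-in for "issue-free"
            "limit": min(PAGE_SIZE, wanted - offset),
            "offset": offset,
        }
        urls.append(f"{GBIF_SEARCH}?{urlencode(query)}")
    return urls


# 0 is a placeholder, not a real taxon key.
urls = occurrence_urls(taxon_key=0)
```

With 800 records and pages of at most 300, each species needs three requests, which goes some way towards explaining the lunchtime once multiplied by 35 species.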

For each of the species, I trained a bioclim model on all 19 bioclim variables, and considered the species present in any pixel for which $\text{Pr}(S) > 0$. This clearly over-estimates the range of the species, but I am very much hoping that the resulting errors will be marginal compared to the spatial scale of the entire dataset. In addition, my implementation of bioclim takes under a tenth of a second for each species, which is essential for exploration.
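The logic of this step can be sketched in a few lines of Python. This is a simplified envelope score in the spirit of bioclim, not the Julia implementation used here: for each variable, the pixel value is ranked among the values observed at the occurrence points, the rank is folded so that the median scores 1 and the extremes score close to 0, and the pixel keeps the lowest score across variables. The mid-rank treatment of ties and the function names are assumptions of the sketch.

```python
from bisect import bisect_left, bisect_right


def bioclim_score(training, pixel):
    """Envelope score in [0, 1] for one pixel.

    training: one list of values per variable, sampled at the occurrences.
    pixel: the value of each variable in the pixel, in the same order.
    """
    score = 1.0
    for values, x in zip(training, pixel):
        ordered = sorted(values)
        below = bisect_left(ordered, x)           # training values strictly below x
        equal = bisect_right(ordered, x) - below  # training values equal to x
        q = (below + 0.5 * equal) / len(ordered)  # mid-rank quantile of x
        # Fold the quantile around the median and keep the most limiting variable.
        score = min(score, 2.0 * min(q, 1.0 - q))
    return score


def is_present(training, pixel):
    """Presence rule from the text: any strictly positive score counts."""
    return bioclim_score(training, pixel) > 0.0


# Toy example with two variables instead of nineteen.
training = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
```

With this rule, a pixel counts as a presence as soon as every variable falls within the range seen at the occurrence points, which is why the predicted ranges are generous.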

Finally, for every pixel in the map (all 45 thousand or so of them), I calculated the LCBD.
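As an illustration of how little computation this involves, here is a hedged Python sketch of the LCBD calculation on a small presence-absence matrix, with pixels as rows and species as columns. It is not the Julia code behind the map. The Hellinger transformation is one of the options discussed by @LegeDeC13 and is included here as an assumption, since this post does not state which transformation was applied; mapping empty rows to zeros is likewise a shortcut chosen for the sketch.

```python
from math import sqrt


def hellinger(Y):
    """Hellinger-transform each row: sqrt of the value over the row total.

    Rows that sum to zero (no species present) are mapped to zeros,
    which is an assumption of this sketch.
    """
    out = []
    for row in Y:
        total = sum(row)
        out.append([sqrt(v / total) if total > 0 else 0.0 for v in row])
    return out


def lcbd(Y):
    """Share of the total sum of squares contributed by each row of Y."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(p)]
    # Squared deviations from the column means, summed within each row.
    ss_sites = [sum((row[j] - means[j]) ** 2 for j in range(p)) for row in Y]
    ss_total = sum(ss_sites)
    return [s / ss_total for s in ss_sites]


# Three pixels, three species: the third pixel has a different assemblage.
Y = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
raw_values = lcbd(Y)
hellinger_values = lcbd(hellinger(Y))
```

In this toy example the third pixel, which holds the only different assemblage, receives the largest share of the total, and the values always sum to 1.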

Here is the result.

@HeinGron17 make the point that high LCBD mostly indicates species-poor locations. This makes sense: one step in computing the LCBD is to measure the sum of squares across rows and columns, so only the marginal sums are involved (as opposed to the entire matrix). As a consequence, the LCBD is not so much a measure of the over-representation of rare species as it is a measure of distinctive assemblages.

There are a few things that are not quite right in this analysis (notably the inclusion of sites where none of the species were present). But the key point is that the LCBD approach can be applied to more continuous scenarios, to produce maps of biodiversity distinctiveness.