Bayesian models for the analysis of botanical citizen science data: Understanding bias and improving inference


Can biased, but abundant, citizen science data on plants be used to effectively monitor environmental change at different scales? And, if so, can such data also be used to infer the impacts of a changing environment on plant communities and the ecosystem services they provide?

Aims of the project

1. To investigate the most effective ways of modelling biased citizen science data on plant species occurrence and abundance throughout Britain.

2. To use novel Bayesian approaches to community modelling to investigate the effects of different approaches on subsequent inference (e.g. combining single species trends versus whole community modelling).

3. To investigate whether such results can also be used to reveal changes in plant communities and ecosystem services at different scales.

Project Description

Biodiversity data collected by amateur (often expert) naturalists, today sometimes known as citizen science data, are an extremely important source of information about the natural environment, and such data are often used to investigate environmental change. Certain types of Bayesian hierarchical models are now widely used to model the spatial and temporal biases inherent in such data, resulting in more reliable inference and decision making.

The Biological Records Centre at CEH Wallingford already has experience in applying Bayesian occupancy models to citizen science biodiversity data (e.g. see Isaac et al. 2014, MEE). We also have access to 40 million vascular plant records collected by the Botanical Society of Britain and Ireland for a forthcoming national atlas, and another ~4 million bryophyte records collected by the British Bryological Society. The questions outlined above under ‘Aims’, particularly those pertaining to the effects of spatial and temporal scale on modelled biodiversity trends, could be rapidly investigated by a student with a developing interest in Bayesian modelling. Recoding our existing models and workflow into the Stan language (to achieve faster, more robust inference, and to take advantage of the increasingly sophisticated Stan modelling ecosystem) would also be an initial aim. The student would also have many opportunities to see how such biodiversity data are collected in the field.

Once optimal models and analytical scales for vascular plant and/or bryophyte citizen science data had been identified, the project would focus on investigating the use of joint models for multiple species (e.g. Warton et al. 2015, Trends Ecol. Evol., 30(12):766-779). Such recently developed approaches allow multiple species, environmental covariates and species traits to be integrated into a single model, providing a Bayesian alternative to approaches such as the popular multivariate ‘fourth corner’ method.

Depending on the results of the early phases of this project, the approaches developed for plant citizen science data would then be applied to questions such as estimating the impacts of invasive species across scales, and trends in ecosystem services across space and time.
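To give a flavour of the occupancy models referred to above, the following is a minimal Python sketch of a textbook single-species, single-season occupancy likelihood. It is not the Biological Records Centre's model, nor the specification used by Isaac et al. (2014); all function names and parameter values here are hypothetical. The idea it illustrates is that a site's true occupancy state is unobserved, so the likelihood sums over "occupied but possibly missed" and "genuinely absent", which is also the form a Stan version would need because Stan cannot sample discrete latent states directly.

```python
import math
import random


def site_log_likelihood(detections, visits, psi, p):
    """Log-likelihood of one site's record, with the occupancy state summed out.

    psi: probability the site is occupied; p: per-visit detection probability.
    Assumes 0 < psi < 1 and 0 < p < 1 (illustrative sketch only).
    """
    # Occupied branch: species present, recorded on `detections` of `visits`.
    occupied = psi * (p ** detections) * ((1.0 - p) ** (visits - detections))
    # Unoccupied branch: only possible if the species was never recorded.
    unoccupied = (1.0 - psi) if detections == 0 else 0.0
    return math.log(occupied + unoccupied)


def log_likelihood(histories, psi, p):
    """Sum the site-level log-likelihoods over (detections, visits) pairs."""
    return sum(site_log_likelihood(d, k, psi, p) for d, k in histories)


def simulate(n_sites, visits, psi, p, seed=1):
    """Simulate (detections, visits) pairs under imperfect detection."""
    rng = random.Random(seed)
    histories = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        d = sum(1 for _ in range(visits) if occupied and rng.random() < p)
        histories.append((d, visits))
    return histories


def naive_occupancy(histories):
    """Proportion of sites with at least one record (ignores missed detections)."""
    return sum(1 for d, _ in histories if d > 0) / len(histories)


if __name__ == "__main__":
    data = simulate(n_sites=500, visits=3, psi=0.6, p=0.3)
    print("naive occupancy:", naive_occupancy(data))
    print("log-likelihood at simulating values:", log_likelihood(data, 0.6, 0.3))
```

Because occupied sites can go unrecorded on every visit, the naive proportion of sites with records will tend to understate true occupancy; separating `psi` from `p` is what allows this kind of bias to be modelled rather than ignored. Real citizen science applications add temporal structure, site and observer effects, and priors on top of this core.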

Methods to be used

Hierarchical Bayesian modelling approaches would be the main methods to be used, but they would be compared with other approaches to modelling species’ trends using citizen science datasets (Isaac et al. 2014). Outputs would also be quality checked and assessed by field researchers and other botanists.

Skills required

An interest in statistical modelling is essential.

An interest in citizen science would be desirable.

Some skills in plant identification and ecology would also be a benefit, as would some understanding of taxonomic and nomenclatural principles, although the successful candidate would have the opportunity to develop skills in these areas.

If interested, please contact Stephen Harris.
