This project will be co-supervised between Oxford and the Royal Botanic Gardens, Kew.
How can the evolutionary history of plant species affect their survival in the Anthropocene?
Extinction risk and past extinctions are not randomly distributed across the animal tree of life. This non-randomness means that entire limbs of the tree are at risk of disappearing, not just scattered leaves and twigs. The same problem may also exist across the c. 350,000 known species of plants, but data are lacking. This is not merely of academic interest: understanding which plant groups are most susceptible to extinction could help inform timely action to prevent loss of unique evolutionary history, and could improve understanding of their evolutionary potential to respond to current and future environmental change. However, the sheer diversity of plants (on top of the c. 350,000 known species, there are tens or perhaps hundreds of thousands more still unknown) has made it difficult to assemble comprehensive datasets suitable for testing such hypotheses on a global scale. Studies focused on relatively small clades, or on plants from particular regions, have given divergent results, highlighting the need for more inclusive analyses. But while botanists scramble to gather the necessary data, estimates of the proportion of plant species threatened with extinction are rising. Can machine learning approaches help us make the best use of the data already available, and so make robust inferences about how extinction risk is distributed across the plant tree of life?
Aims of the Project
Develop a new roadmap for exploring phylogenetic (and functional) correlates of extinction risk in angiosperms.
Predict extinction risk in three major plant clades in the Myrtaceae, testing and contrasting published approaches and developing novel methods where required for:
imputation to fill data gaps in phylogenies or trait matrices;
prioritisation of species/areas for more detailed study;
ground-truthing results in the field and lab.
Many scientists have argued or assumed that, once sufficient data are available, extinction risk and past extinctions will be shown to be unevenly distributed across the plant tree of life, as is the case for animals. Understanding which plant groups are particularly susceptible to extinction can help inform timely action to prevent loss of unique evolutionary history, as well as the loss of evolutionary potential to respond to current and future environmental change. However, the sheer diversity of plants (with c. 350,000 known species and tens or perhaps hundreds of thousands more still unknown to science) has made it difficult to assemble comprehensive datasets suitable for testing such hypotheses. The relatively small-scale analyses conducted to date have yielded conflicting or inconclusive results.
Through Herculean efforts, botanists doubled the number of plant species with extinction risk assessments on the IUCN Red List of Threatened Species over the period 2017-2019. But those efforts brought total coverage of known plant species to just over 10%; furthermore, Red List coverage is patchy, with many taxonomic and geographic biases. Understanding of the plant tree of life is similarly patchy and biased, with just 50% of genera represented to date and many tropical plant families very poorly sampled.
While botanists scramble to gather more data, estimates of the proportion of plant species threatened with extinction are rising. There is an urgent need to make the best use of existing data: to evaluate the extent to which they can provide robust estimates of the species and areas most threatened with plant extinction, to quantify the uncertainty in these estimates, and to pinpoint the taxa and areas that need to be studied in more detail in order to reduce uncertainty and strengthen the case for targeted conservation action.
This project seeks to apply machine learning approaches to make the best use of the data already available, and so make robust inferences about how extinction risk is distributed across the (incompletely known) plant tree of life. The student will test a range of published approaches for predicting extinction risk, detecting phylogenetic signal in extinction risk and prioritising taxa and areas for conservation. The (initial) study group will be species of the Myrtaceae family (6,000 species distributed primarily in the tropics and subtropics, with centres of diversity in the neotropics, Australasia and SE Asia). With species diversity comparable to that of the animal class Mammalia, but with more threatened species, the Myrtaceae has been identified as an ideal model group for this study. It includes major clades with high levels of data availability (including phylogenetic trees and extinction risk assessments) and public recognition (e.g. Eucalyptus in Australia), as well as others with high ecological importance but relatively low data availability and public recognition (e.g. Myrcia in the neotropics).
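As a purely illustrative sketch of what "detecting phylogenetic signal in extinction risk" can mean in practice, the Python snippet below runs a simple permutation test: it asks whether threatened species are closer to one another on the tree than randomly chosen species would be. This is a simplified stand-in and not one of the published methods the project would evaluate. The six species, their clade labels and the pairwise distances are all hypothetical.

```python
import itertools
import random

# Hypothetical toy data: six species in two clades (not real Myrtaceae taxa).
CLADES = {"sp1": "A", "sp2": "A", "sp3": "A", "sp4": "B", "sp5": "B", "sp6": "B"}

# Assumed pairwise phylogenetic distances: 2.0 within a clade, 10.0 between clades.
DIST = {
    a: {b: 0.0 if a == b else (2.0 if CLADES[a] == CLADES[b] else 10.0) for b in CLADES}
    for a in CLADES
}


def mean_pairwise_distance(dist, taxa):
    """Mean phylogenetic distance over all pairs of the given taxa."""
    pairs = list(itertools.combinations(sorted(taxa), 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def clustering_p_value(dist, threatened, n_perm=999, seed=0):
    """One-sided permutation p-value for phylogenetic clustering.

    Compares the observed mean pairwise distance among threatened species
    with the same statistic for random species sets of equal size.
    """
    rng = random.Random(seed)
    all_taxa = sorted(dist)
    observed = mean_pairwise_distance(dist, threatened)
    k = len(threatened)
    as_small = sum(
        mean_pairwise_distance(dist, rng.sample(all_taxa, k)) <= observed
        for _ in range(n_perm)
    )
    return (as_small + 1) / (n_perm + 1)


threatened = ["sp1", "sp2", "sp3"]  # hypothetical: all of clade A is threatened
observed_mpd = mean_pairwise_distance(DIST, threatened)
p_value = clustering_p_value(DIST, threatened)
```

A small p-value would suggest that threat is concentrated in particular limbs of the tree rather than scattered across it. With only six species the test has little power, so the toy data serve only to show the mechanics.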
The supervisory team has world-leading expertise in all of the techniques applied here, including plant systematics, particularly in the Myrtaceae [EL, ENL]; plant biodiversity assessment, phylogeny and biogeography [ENL, FF, EL]; and phylogenetic diversity analysis and computational conservation planning [RG, FF]. Further expertise in the necessary laboratory and field techniques is available at RBG Kew, the University of Oxford and local universities at field sites.
Outputs will depend on the student’s choices in developing the project, but objectives will address the following questions of general interest to global conservation biology and policy:
Can the position of a plant species in an evolutionary tree give valuable information about its risk of extinction?
What phenotypic or geographic factors also matter, and how much?
How sensitive are species and area conservation prioritisation recommendations to the systematic, data-cleaning, gap-filling and optimisation decisions made during analyses?
Can targeted data collection reduce uncertainty with respect to priority taxa and area for conservation? If so, where and what should be collected?
Given that successful conservation strategies are co-produced with the people they affect, how do prioritisation schemes based on phylogenetic diversity compare to those based on cultural saliency in terms of (i) total plant diversity targeted for conservation; (ii) potential for engagement?
Methods to be used
The study is multi-disciplinary, requiring a willingness to learn species distribution modelling, species risk assessment/Red Listing, neural networks and related ML techniques, as well as phylogenomic sequencing, bioinformatics, and general statistical and numerical skills. Methods will also include calculating phylogenetic diversity metrics and relating them to species distribution, habitat type and habitat quality according to the human influence index. Fieldwork will be used to ground-truth risk assessments and fill gaps in phylogenetic trees. Some systematic methods will be required to understand the three study groups used as ‘models’ of evolution, ecological process and response in the environments on which the study will focus.
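To give a flavour of the phylogenetic diversity metrics mentioned above, here is a minimal Python sketch of one widely used metric, Faith's phylogenetic diversity (PD): the summed branch length of the subtree connecting a set of species to the root. The tree, its branch lengths and the species names are hypothetical, and the project may well use other metrics or established software instead.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical toy tree: node -> (parent, length of the branch leading to that node).
Tree = Dict[str, Tuple[Optional[str], float]]

TOY_TREE: Tree = {
    "root": (None, 0.0),
    "A": ("root", 2.0),
    "B": ("root", 1.0),
    "sp1": ("A", 1.0),
    "sp2": ("A", 1.5),
    "sp3": ("B", 3.0),
}


def faith_pd(tree: Tree, species: Iterable[str]) -> float:
    """Sum the branch lengths on the paths from each species to the root,
    counting every shared branch only once."""
    visited = set()
    total = 0.0
    for sp in species:
        node: Optional[str] = sp
        # Walk towards the root, stopping once we reach a branch already counted.
        while node is not None and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


pd_all = faith_pd(TOY_TREE, ["sp1", "sp2", "sp3"])
pd_without_sp3 = faith_pd(TOY_TREE, ["sp1", "sp2"])
pd_lost = pd_all - pd_without_sp3  # evolutionary history lost if sp3 goes extinct
```

In this toy tree, losing sp3 removes both its own branch and the branch leading to its otherwise empty clade B, which illustrates why the extinction of an isolated lineage costs more evolutionary history than the loss of a species with close surviving relatives.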
Specialised skills required
The student will ideally have experience in managing large datasets, for example in Excel or SQL databases, and basic coding in R or Python. Some experience in phylogenomic sequencing would also be advantageous. These skills are fundamental to the project, so whilst previous experience is desirable, a strong commitment to learning them at a research level is essential.
Please contact Richard Grenyer (email@example.com) and Eve Lucas (E.Lucas@kew.org) if you are interested in this project.