Interview with the authors: Mega-fire in redwood tanoak forest reduces bacterial and fungal richness and selects for pyrophilous taxa that are phylogenetically conserved

In a recent paper in Molecular Ecology, Enright et al. examined how soil microbiomes are affected by extreme fires. The Soberanes mega-fire provided the authors with an opportunity to study how such extreme events, which are increasingly common with climate change, can have lasting effects on ecology. By sampling the soil microbiome before and after the Soberanes mega-fire, Enright et al. demonstrated dramatically altered soil communities and a reduction in species richness associated with the mega-fire. There was a clear phylogenetic pattern to the particular microbes that increased or decreased in abundance after the fire. Drawing on their results, Enright et al. propose a framework to predict the traits that post-fire microbial communities might exhibit.

We sent some questions to Sydney Glassman, one of the corresponding authors of this work, to get more detail on this new study.

Aerial view of the Soberanes mega-fire. Photo credit: Calfire

What led to your interest in this topic / what was the motivation for this study? 

During my PhD at UC Berkeley, I originally wanted to sample the redwood tanoak forests of Big Sur because I was interested in the cascading effects of sudden oak death (SOD) induced mortality on soil fungal communities. Prof Dave Rizzo at UC Davis had a large plot network investigating the effects of SOD on plant mortality. I teamed up with him in 2011 to select a subset of plots and collect soils to investigate the impacts on the soil microbial community via amplicon sequencing. Then, in 2016, I learned that half my plots burned in the catastrophic Soberanes Megafire. It’s extremely rare to have pre- and post-fire samples from the same sampling locations for a mega-fire. I was really curious about what the impact of a mega-fire would be on soil microbial communities, especially since they had never been studied in redwood tanoak forests before. These forests are endemic and charismatic megaflora of California that are facing multiple global change factors, and it is really unclear how the soil microbial communities will respond to wildfires and how that will influence the recovery of the vegetation. I had already moved to southern California at this time to start a post-doc at UC Irvine, so I asked Kerri Frangioso, who lived in Big Sur, if she would be able to re-sample any of the plots that burned. Using GPS, she was able to collect soils from the exact same sampling locations that I had sampled in 2011 from 3 of the plots (2 burned and 1 unburned) within 30 days of the fire being declared over. She mailed these soils to me, I extracted the DNA, and froze everything until I was able to start my own lab at UC Riverside in 2018.

What difficulties did you run into along the way? 

The terrain in Big Sur is notoriously challenging to traverse. It is extremely steep, with lots of winding dirt roads and a lot of poison oak. There is no cell reception in any of our plots and most are at least an hour from the nearest town. Collecting the soil even before the fire was challenging enough. However, after fires, it is really challenging to access sites because roads are closed, landslides are common, and dead or dying trees are extremely hazardous, especially in the case of wind. We were very lucky to be able to re-sample even 3 of our plots so fast after the fire.

What is the biggest or most surprising innovation highlighted in this study? 

I was really surprised that many of the same pyrophilous “fire loving” microbes that have been found to increase in frequency after pine forest fires also increased in frequency after redwood tanoak fires. That indicates that soil microbes are selected for by slightly different pressures than plants, because the plants that regenerate post-fire in pine forests vs redwood tanoak forests are very different. It seems more likely that microbes instead survive via temperature thresholds, and if a fire is severe enough, similar groups of microbes will respond. We collaborated with Kazuo Isobe to implement the CONSENTRAIT analysis and identified that microbial response to fire was indeed phylogenetically conserved: related groups of bacteria and fungi tended to respond positively or negatively to fires together. This will greatly enhance our ability to predict which microbes will respond to fire in any ecosystem, since certain lineages seem evolutionarily adapted to survive fires. We also found that a basidiomycete yeast, Basidioascus, dominated the fungal sequences at 30 days post-fire, and that had never been found before, probably because most post-fire sampling historically has been based on fruiting bodies.

Morphological diversity of soil microbes. Photo credit: Jenna Maddox

Moving forward, what are the next steps in this area of research?

I was able to leverage some of these results and results from my work sampling wildfires in Southern California chaparral to help me acquire a USDA grant from their Agricultural Microbiomes program. The purpose of this grant is to characterize the traits of pyrophilous microbes and begin to bring our knowledge of fire adaptation in microbes up to the level of our knowledge for plants. We understand a lot of the traits that enable plants to survive wildfires (like thick bark, vegetative resprouting, serotinous cones, etc.) but we don’t have a similar understanding of those traits in microbes. To understand these traits, Dylan Enright has begun performing biophysical trait assays on a large culture collection of pyrophilous microbes that I have been developing since I started my lab in July 2018. Over the last four years, 2 lab managers, one PhD student (Dylan Enright), 13 UCR undergraduates, and one part-time laboratory technician have been involved in developing this culture collection of over 400 isolates of bacteria and fungi from burned soils from wildfires. Our goal is to characterize their traits with biophysical assays and eventually with genomics.

Have you gone back (or have you any plans to go back) to sample soils in the post-fire period? How long lasting do you think the effects of fire on microbial communities would be? 

Unfortunately, I have not been able to get this particular project funded (despite several attempts) and everything I did for this paper was completely unfunded. So I have not been able to return to these plots to sample again. I would be interested in returning to them eventually. I would predict the effects of the fire on the microbial communities could last decades if not longer, depending on if the plants themselves have been able to recover. Most of the literature on pyrophilous microbes suggests that high severity fire can have long term impacts on soil microbes that can last at least a decade or more. Given that the richness of both bacteria and fungi was reduced by up to 70% in one of our plots, I would predict it will take a long time to recover.

Describe the significance of this research for the general scientific community in one sentence.

Megafires have long lasting impacts on both plants and soil microbes alike, and it is important to understand the impacts on soil microbes since they drive plant and soil regeneration. 

Describe the significance of this research for your scientific community in one sentence.

The pyrophilous microbes that respond to a mega-fire in redwood tanoak forests are similar to those that respond to high severity wildfires in better studied pine forest systems, and the fact that they are phylogenetically conserved indicates that we will be able to predict what microbes will respond to wildfires in any system. Further, we are beginning to identify conserved trait responses that enable wildfire response that are analogous to those in plants and will help us bin and better understand fire adaptation traits in microbes.


Enright, D. J., Frangioso, K. M., Isobe, K., Rizzo, D. M., & Glassman, S. I. (2022). Mega-fire in redwood tanoak forest reduces bacterial and fungal richness and selects for pyrophilous taxa that are phylogenetically conserved. Molecular Ecology, 31, 2475–2493.

Interview with the authors: Genetic data and niche modeling reveal complex interspecific interactions of invasive species with native congeners and help evaluate distribution pattern, range limits and invasion risk of the species

In a recent paper in Molecular Ecology, Espindola and Vázquez-Domínguez et al. combined comprehensive fieldwork, genetic analyses and a novel niche modeling approach to investigate the population genetic and distribution patterns of the native and non-native red-eared slider turtle (Trachemys scripta elegans), one of the worst invasive species in the world, and its congeners. They found very little naturally occurring distribution overlap and genetic admixture between the red-eared slider and the other Trachemys species studied. In addition, they demonstrated that the native Trachemys species in Mexico have distinct climatic niche suitability, which probably prevents the invasion of the red-eared slider in the area. However, major niche overlap was found between the non-native red-eared slider and native species from different parts of the world, indicating that sites closer to the ecological optima of the invasive species have higher establishment risk than those closer to the niche centre of the native species.

We sent a number of questions to lead authors of this work, Sayra Espindola and Ella Vázquez-Domínguez, to get more detail on this study.

What led to your interest in this topic / what was the motivation for this study? 

We are interested in the population genetics of invasive species. In addition, Trachemys scripta elegans, one of the world’s 100 worst invasive species, is native to NE North America, and several native congeneric species are naturally distributed along the eastern coast of Mexico, which is an extraordinary scenario to test the effect of congeners on potential invasion patterns and evaluate their climatic and niche differences.

Trachemys spp (T. scripta, T. venusta, T. cataspila, T. taylori) and their distributions along the west coast of USA and Mexico. Trachemys scripta (red dots) in Mexico is non-native. A turtle trapping net is shown. Figure credit: Sayra Espindola/Ella Vázquez-Domínguez

What difficulties did you run into along the way? 

Maybe the most significant was that, at the time we did the molecular laboratory work, extracting DNA from samples that had been stored in formaldehyde (museum samples) was rather difficult, so we could not obtain genomic data (SNPs) for those samples (extraction kits are much more efficient now). Nonetheless, we did sequence nuclear microsatellite loci, which provided adequate genetic information that enabled us to show the significant contemporary genetic differentiation present between native and non-native Trachemys scripta elegans individuals.

What is the biggest or most surprising innovation highlighted in this study? 

There are two interesting findings. One is that non-native Trachemys scripta elegans individuals have very little naturally occurring distribution overlap and admixture with their congeners – they exhibited reduced gene flow and clear genetic separation despite having zones of contact. Also, we demonstrate that the native Trachemys species studied (T. cataspila, T. venusta) have distinct climatic niche suitability, which prevents the establishment of and displacement by the non-native Trachemys scripta elegans. Yet, as T. s. elegans has invaded and displaced native turtle species worldwide, we show that sites closer to T. s. elegans’ niche-center have higher establishment risk than those closer to the niche-center of the native species.

Moving forward, what are the next steps in this area of research?

We are working with our genomic data to identify loci under selection to evaluate the potential connection between specific genes and adaptive traits in these turtles. Considering the distinct climatic niches and distribution we found for the turtles, we are very keen to elucidate whether there are adaptive differences among them. In addition, our results set the basis for future work – whole genome or gene-targeted sequencing, as well as a higher number of field-sampled individuals, would allow assessment of hybridization and specific gene introgression.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

We would first tell them that molecular ecology research, combining ecological fieldwork and laboratory tasks, is absolutely amazing! We recommend choosing to work with the species/taxa that you like most – this makes the journey very enjoyable; and also selecting a laboratory and research group with ample experience in molecular work and analyses that is, at the same time, not afraid of proposing novel questions and ways of analyzing them.

What have you learned about methods and resource development over the course of this project?

In this project, we proposed and developed a novel modeling approach in which, by contrasting the niche suitability of the species, we were able to include, indirectly, the interactions that can occur when a species is introduced to habitats occupied by other species. The model is based on analyses of climatic niche suitability and the environmental centrality hypothesis, where fitness is expected to be highest in sites with environments closest to the center of the niche of the species. The development of this model and its algorithms required an immense amount of trial and error, and once we had the final version, we had to improve it again after revision. The lesson then is that developing analytical models can take a great deal of time, but it is always worthwhile!
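
To make the environmental centrality idea more concrete, here is a minimal sketch in Python. It is not the authors' model: it assumes a simplified standardized Euclidean distance to the mean climate of a species' occurrence records, and the function names and climate values (mean annual temperature in °C, annual precipitation in mm) are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def niche_center_distance(occurrences, site):
    """Standardized Euclidean distance from `site` to a species' niche center.

    occurrences: list of climate vectors, one per occurrence record.
    site: climate vector of the location being evaluated.
    The niche center is approximated by the per-variable mean (an assumption).
    """
    columns = list(zip(*occurrences))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]  # scale each climate variable
    return sqrt(sum(((x - c) / s) ** 2
                    for x, c, s in zip(site, centers, spreads)))


def closer_to_invader(invader_occ, native_occ, site):
    """True if the site lies nearer the invader's niche center than the native's."""
    return (niche_center_distance(invader_occ, site)
            < niche_center_distance(native_occ, site))


# Hypothetical occurrence climates: (temperature in °C, precipitation in mm).
invader_occ = [(12.0, 900.0), (14.0, 1100.0), (16.0, 1000.0), (13.0, 1200.0)]
native_occ = [(24.0, 1800.0), (26.0, 2200.0), (25.0, 2000.0), (27.0, 2100.0)]

site_a = (14.0, 1050.0)   # climate resembling the invader's occurrences
site_b = (25.5, 2000.0)   # climate resembling the native's occurrences

for label, site in (("site_a", site_a), ("site_b", site_b)):
    print(label, "closer to invader niche center:",
          closer_to_invader(invader_occ, native_occ, site))
```

Under the centrality hypothesis, a site flagged as closer to the invader's niche center would be read as carrying a higher establishment risk; a real analysis would use many more climate variables and a properly fitted niche model.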

Little climatic niche overlap between Trachemys scripta and two of its congeners, T. venusta and T. cataspila. Figure credit: Sayra Espindola/Ella Vázquez-Domínguez

Describe the significance of this research for the general scientific community in one sentence.

The distribution, range limits and invasion risk of invasive species can be evaluated with genetic information and ecological niche modeling.

Describe the significance of this research for your scientific community in one sentence.

Evaluating interspecific interactions between native and non-native closely related species with genetic information and a niche modeling approach was key to determining the distribution patterns, range limits and invasion risks of Trachemys scripta elegans.

Wetlands system in the valley of Cuatrocienegas, Coahuila, Mexico, where the endemic Trachemys taylori lives. Photo credit: Ella Vázquez-Domínguez

Espindola S, Vázquez-Domínguez E, Nakamura M, Osorio-Olvera L, Martínez-Meyer E, Myers EA, Overcast I, Reid BN, Burbrink FT. 2022. Complex genetic patterns and distribution limits mediated by native congeners of the worldwide invasive red-eared slider turtle. Molecular Ecology. https://doi.org/10.1111/mec.16356.

Interview with the authors: Associations between MHC class II variation and phenotypic traits in a free-living sheep population

In a recent paper in Molecular Ecology, Huang, Dicks and colleagues analysed variation in the major histocompatibility complex (MHC) and phenotypic traits in an unmanaged population of sheep living on an island off the coast of Scotland. This population of sheep has been studied closely for around 70 years, providing a very rare level of insight and statistical power to evolutionary genetic studies. The MHC is among the most variable parts of mammalian genomes and has long been known to encode proteins central to the adaptive immune system. Through their analyses, Huang, Dicks and colleagues found associations between levels of circulating antibodies and variation at MHC loci.

We sent some questions to the corresponding author of this work, Wei Huang, to get more detail on this new study.


Rams in St Kilda. Photo credit: Martin Adam Stoffel

Can you describe the significance of this research for the general scientific community in one sentence?

This study demonstrated the direct link between immune genes and antibody levels in wild populations.

What led to your interest in this topic / what was the motivation for this study? 

The major histocompatibility complex (MHC) contains a number of genes linked with immune defence in vertebrates. Associations between MHC variation and phenotypic traits or pathogens have been identified in many species. Selection on MHC genes has also been demonstrated in some studies. However, many previous studies only examined associations between MHC variation and a limited number of phenotypic traits or pathogens. Few of them have examined both MHC-fitness associations and MHC-trait associations. The longitudinal study of Soay sheep in St Kilda is a great system to study the associations between MHC variation and phenotypic traits and how the associations are linked with selection on MHC genes. Using three representative phenotypic traits monitored in thousands of sheep over decades, we are able to provide a full picture of MHC-trait associations in wild populations.


Can you describe the significance of this research for your scientific community in one sentence?

This study suggests associations between MHC and phenotypic traits are more likely to be found for traits more closely associated with pathogen defence than integrative traits and highlights the association between MHC variation and antibodies in wild populations.

What difficulties did you run into along the way? 

It is extremely hard to monitor populations and collect longitudinal data over decades. Thanks to our great field assistants and volunteers, the Soay sheep data has provided a good foundation. In terms of the specific study, the first difficulty is to genotype MHC in a large number of sheep. We used two steps to genotype the MHC genes. We first used genotype-by-sequencing to genotype hundreds of sheep. Then, benefiting from the high-density sheep SNP chip, we were able to use 13 SNPs to genotype MHC in the other thousands of sheep successfully.

Additionally, it is hard to choose the appropriate model. Some of our traits are not normally distributed and are also not close to other common error structures. We instead used Bayesian statistical methods to run the analysis.

What is the biggest or most surprising innovation highlighted in this study? 

We used three representative traits to examine the associations between MHC variation and phenotypic traits. The traits included a fitness-related integrative trait, body weight; a measure of gastrointestinal parasites, faecal egg count; and the levels of three antibodies. All three traits are related to fitness. We only found associations between MHC variation and antibodies. Such results reflect the important role of MHC in immune defence in wild populations. Our study is one of the first studies to examine associations between MHC variation and multiple phenotypic traits.

How do you think your results generalize to other systems?

Our study is based on the longitudinal study of Soay sheep. The large sample size provides great statistical power. Therefore, our results are reliable and solid. Also, we investigated phenotypic traits that have different links with immune defence. Therefore, our results can reflect the general pattern of MHC-trait associations.

You conclude from your study that MHC variation is more likely to be associated with immune traits. How would you validate your findings for species with less rich data?

First, it is possible to use experiments to test the associations. In terms of wild populations, future studies can investigate multiple populations or multiple traits in a single population if they are restricted by the study length.

Moving forward, what are the next steps in this area of research?

Our study demonstrates that it is important to study MHC-antibody associations. Future studies should focus on immune traits rather than only examining MHC-pathogen associations. Also, previous studies are often constrained by small sample sizes. It would be nice if future studies could increase their sample size to strengthen the statistical power.


Huang, W.*, Dicks, K. L.*, Ballingall, K. T., Johnston, S. E., Sparks, A. M., Watt, K., Pilkington, J. G., & Pemberton, J. M. (2022). Associations between MHC class II variation and phenotypic traits in a free-living sheep population. Molecular Ecology, 31, 902–915.

*These authors contributed equally to this work

Interview with the authors: A holobiont view of island biogeography: Unravelling patterns driving the nascent diversification of a Hawaiian spider and its microbial associates

In their recent paper in Molecular Ecology, Armstrong and Perez-Lamarque et al. investigated the evolution of the holobiont. The holobiont is the assemblage of species associated with a particular host organism. In the case of this study, the holobiont refers to the stick spider (Ariamnes), its microbiome and its endosymbionts. Taking advantage of the successive colonization of islands in a volcanic archipelago, Armstrong and Perez-Lamarque et al. contrasted the evolutionary history of the host species with that of the different components of the holobiont on different islands in Hawai’i.

We sent some questions to the authors of this work and here’s what Benoît Perez-Lamarque, Rosemary Gillespie and Henrik Krehenwinkel had to say.

Ariamnes waikula (on the island of Hawaii). Photo credit: George Roderick

What led to your interest in this topic / what was the motivation for this study? 

Gut microbiota play multiple roles in the functioning of animal organisms. In addition, host-associated microbiota composition can be relatively conserved over time and the concept of the “holobiont” has been proposed to describe the ecological unit formed by the host and its associated microbial communities. Yet, it remains unclear how the different components of the holobiont (the hosts and the microbial communities) evolve. This is what spurred our interest. Taking advantage of the chronologically arranged series of volcanic mountains of the Hawaiian archipelago, we were able to tackle this question and could investigate how the different components of the holobiont have changed as the host spiders colonized new locations.

Can you describe the significance of this research for the general scientific community in one sentence?

The evolution of Hawaiian spider hosts and their associated microbes are differently impacted by the dynamic environment of the volcanic archipelago.

Can you describe the significance of this research for your scientific community in one sentence?

The host and its associated microbiota may not act as a single and homogeneous unit of selection over evolutionary timescales.

Ariamnes waikula (on the island of Hawaii). Photo credit: George Roderick

What difficulties did you run into along the way? 

Not all components of the holobiont are equally easy to study. For instance, for the host spiders, we used double digest RAD sequencing (ddRAD) to obtain genome-wide single nucleotide polymorphism data. With such data, we could precisely reconstruct the evolutionary histories of the different spider populations over the last couple of million years and track the finest changes in their genetic diversity. In contrast, characterizing the composition of the microbial components is much more challenging. We used metabarcoding of a short region of the 16S rRNA gene to identify the bacteria present. However, over such short evolutionary timescales, this DNA region is too conserved to accumulate many differences between isolated populations. Therefore, we had high-resolution data for the spider hosts but comparably low-resolution data for the bacterial communities. To ensure that the observed patterns were not artefactually driven by such differences in resolution, we complemented our analyses with a range of simulations to assess the robustness of our findings.

What is the biggest or most surprising innovation highlighted in this study? 

We find that the different components of the holobiont (the host spiders, the intracellular endosymbionts, and gut microbial communities) respond in distinct ways to the dynamic environment of the Hawaiian archipelago. While the host spiders have experienced sequential colonizations from older to younger volcanoes, resulting in a strong (phylo)genetic structuring across the archipelago’s chronosequence, the gut microbiota was largely conserved in all populations irrespective of the archipelago’s chronosequence. More intermediately, we found different endosymbiont genera colonizing the spiders on each island. This suggests that this holobiont does not necessarily evolve as a single unit over long timescales.

In the conclusion to your study, you point out how different components of the holobiont likely contribute differently to selection/colonization history in this system. If you had unlimited resources, what would you do to strengthen this conclusion? 

We indeed suspect that the different components of the holobiont probably did not act as a single and homogeneous unit of selection during the colonization of the Hawaiian archipelago. First, it would be ideal to perform an even broader sampling, targeting more Ariamnes populations and species from older islands, to better characterize the long-term changes of the different holobiont components. Using sequencing techniques with better resolution (as detailed below) would also improve our characterization of the microbial component(s) of the holobiont. Second, to properly test for selection, we should perform transplant experiments of the bacterial communities between spider populations/species and measure whether or not this impacts holobiont fitness. We would expect to find a significant impact of the transplant for the endosymbionts, but no or low impact for the gut bacterial communities of these spiders.

The geological history of Hawai’i provides a powerful system to build understanding of the evolution of the holobiont. Are you aware of other systems where similar studies could be performed? (I appreciate that this is related to the previous question!)

Many other archipelagos with similar island chronosequences, like the Canary Islands or the Society Islands, are also ideal for testing hypotheses on the evolution of holobionts. Within the Hawaiian archipelago, we could also replicate our work on other holobiont systems. For instance, among arthropods, plant feeders might rely more heavily on their microbiota for their nutrition, and this might translate into different patterns of holobiont evolution.

Moving forward, what are the next steps in this area of research?

As previously said, one main limitation is the low resolution of the 16S rRNA metabarcoding. This prevented us from looking at the evolutionary history of the individual bacterial lineages. Using a new model, we have recently tackled this issue of low resolution (https://doi.org/10.1128/msystems.01104-21) and we reported little evidence of microbial vertical transmission in these holobionts. Yet, the next step would be to move from classical metabarcoding to metabarcoding with longer sequencing reads (e.g. the whole 16S rRNA gene) or even metagenomics. It would provide more resolution for looking at bacterial evolution and would also bring more information on the functioning of these bacterial communities (e.g. are gut microbiota contributing to the digestion of these Hawaiian spiders in natural environments?).


Armstrong, E. E.*, Perez-Lamarque, B.*, Bi, K., Chen, C., Becking, L. E., Lim, J. Y., Linderoth, T., Krehenwinkel, H., & Gillespie, R. G. (2022). A holobiont view of island biogeography: Unravelling patterns driving the nascent diversification of a Hawaiian spider and its microbial associates. Molecular Ecology, 31, 1299–1316.

*Authors contributed equally

Interview with the authors: Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers

In a recent paper in Molecular Ecology, Caballero-López and colleagues investigated the genetics of migratory behaviour in two subspecies of willow warbler (Phylloscopus trochilus trochilus and Phylloscopus trochilus acredula). Previous work had identified several genetic markers associated with migratory behaviour in this species, but a particularly important candidate marker could not be mapped to previous genome assemblies. This suggested to Caballero-López et al. that the important marker may lie in a highly repetitive, and thus difficult to assemble, genomic region. Leveraging a recent genome assembly based on long-read technology and a quantitative PCR approach, Caballero-López et al. found that the elusive migration marker is located in a genomic region rich in remnants of transposable elements.

We sent some questions to the primary author of this work, Violeta Caballero-López, to get some more insight and details about this exciting study.

Willow warbler male, 2017. Photo credit: Harald Ris.

What led to your interest in this topic / what was the motivation for this study? 

My research aims to shed some light on our understanding of the genetics underpinning bird migration, which is currently very poor. Passerine birds migrate alone, and they follow the same routes to wintering grounds as their parents, fully relying on genetic mechanisms.

The motivation for this specific study was to try to characterize a region in the genome which varies between two subspecies of willow warbler that present differential migration to Africa. Until now, this region was only identified as an AFLP-derived marker which failed to be mapped to the genome. However, with the use of molecular techniques such as qPCR in combination with a good quality genome assembly, we could understand the nature of this element better.

AFLP: Amplified Fragment Length Polymorphism

qPCR: Quantitative Polymerase Chain Reaction

Can you describe the significance of this research for the general scientific community in one sentence?

Repeat-rich regions, which are often considered “junk DNA”, might have a larger role in phenotypes and function than previously thought.


Can you describe the significance of this research for your scientific community in one sentence?

It is important to revisit the role of repeat DNA in determining a complex trait such as bird migratory routes.

Willow warbler singing in Siberia, 2017. Photo credit: Harald Ris.

What difficulties did you run into along the way? 

For more than 20 years “WW2” has been an elusive AFLP marker, observed to be fixed in the “northern” subspecies P. t. acredula. It could only be amplified in PCR as a 154 bp fragment and then sequenced, but its nature was totally unknown. The identification and curation of this sequence as a transposon (TE) was challenging because it is an old, degraded “LTR portion” of the full element. This required a willow warbler genome built with long read sequencing techniques that provided regions of the genome rich in repeat DNA. Locating the ends of this transposon was also complicated. Alignment “breaks” serve as a detection method for the target site duplications that mark the edge of these elements. However, they could not be used in our system because these TEs appear consistently embedded within a larger block of repeats. This interfered with our estimation of age and theories about the origin of the repeat.

LTR: Long terminal repeat

What is the biggest or most surprising innovation highlighted in this study? 

The most surprising finding here is the presence of a large repeat-rich region (>12 Mb) that segregates in both willow warbler subspecies. This region is characterized by several copies of the WW2 derived variant, which turned out to be part of a transposable element belonging to the endogenous retrovirus family. Furthermore, we provide solid evidence of its independence from the other polymorphic regions in chromosomes 1 and 5. As this TE seems to be inactive, and no clear functional genes have been detected in its surroundings, it remains puzzling why this region correlates with migration in the willow warbler so strongly.

You end your paper describing how it’s premature to think that the association of the WW2 derived variant has a causal role in the trait. Based on your knowledge of the warbler genome, would you care to speculate as to the actual causal basis of the phenotype?

The most supported hypothesis is that migration is a complex trait influenced by gene packages. In the case of the willow warblers, I would speculate that the repeat rich region, and not necessarily the WW2 derived variant itself, could affect migration indirectly through 1) the formation of a structural variant in a chromosome that affects gene expression 2) the trans-regulation from this region of some gene(s) elsewhere in the genome 3) the presence of an adjacent gene outside this region that we have not been able to detect in the current genome assemblies so far 4) a missed single copy gene within the repeat rich region. However, the last one is the least likely given that areas with such a repeat density rarely contain functional genes.

Have you got any ideas of how you might test the hypothesis that chromosomal rearrangements were facilitated by the presence of TEs?

The most exciting possibility is to visually confirm if these rearrangements have taken place. A way to test this empirically would be to obtain a karyotype of each subspecies and combine it with fluorescence in situ hybridization (FISH). First, a probe labelling the WW2 derived variants would signal the location of the repeat-rich region. Once the location of this region is resolved, it is possible to design several fluorescent probes outside of it to determine if the chromosomal arrangement around it is maintained both in the genome of P. t. acredula and its orthologue region in P. t. trochilus.

Moving forward, what are the next steps in this area of research?

The biggest mystery within this study is the location in the genome of this repeat-rich region that contains several copies of the WW2 derived variant. One of the biggest challenges of genome assemblies is the mapping and correct location of repeat-dense sequences, and therefore future effort should be focused on targeting empirical evidence of the location of this region. Then we could get a better hint on if and/or how this region affects migration. Is it downstream or upstream of any gene complex? Is it silenced? How does its orthologue look in P. t. trochilus?

Typical working setup for the willow warbler team, 2021. Photo credit: Harald Ris.

Caballero-López, V., Lundberg, M., Sokolovskis, K., & Bensch, S. (2022). Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers (Phylloscopus trochilus). Molecular Ecology, 31, 1128–1141.

Interview with the authors: How genomic data reveal cryptic species and how migration patterns maintain genetic divergence in birds?

In a recent paper in Molecular Ecology, Tang et al. investigated genetic divergence of different subspecies of pale sand martin (Riparia diluta) using genome-wide data. They found that the subspecies in Central and East Asia, which vary only gradually in morphology, broadly represent three genetically differentiated lineages. No signs of gene flow were detected between two lineages that met at the eastern edge of the Qinghai-Tibetan Plateau, which is likely due to largely different breeding and migration timing. Limited mixed ancestries were found in Mongolian populations between two lineages that might take divided migration routes around the Qinghai-Tibetan Plateau, and the authors hypothesize that selection against hybrids with nonoptimal migration routes might restrict gene flow. See the full article for more details of the study and the interview with lead author Manuel Schweizer below for more stories behind this exciting work.

Pale sand martin Riparia diluta tibetana, Mongolia, June 2018. Photo Credit: Manuel Schweizer

What led to your interest in this topic / what was the motivation for this study? I studied pale sand martin in Central Asia as part of the work on a field guide to the birds of Central Asia, which was published in 2012. I was then fascinated by the fact that the different subspecies described for this species breed in completely different environments: Central Asian steppes and semi deserts, high altitude grasslands on the Qinghai-Tibetan Plateau, or lowland subtropical China. Although it was evident that morphological identification of single individuals of the different subspecies without context is not possible, I suspected that cryptic diversity might be involved in this complex. This was corroborated by mtDNA data that we published in 2018. Together with Gerald Heckel and our PhD student Qindong Tang, I wanted to investigate this further using genome-wide data and test if gene flow is reduced in areas of potential contact between evolutionary lineages.

Breeding site of pale sand martin on the east edge of Qinghai-Tibetan Plateau in Zoige (Sichuan Province, China). Photo Credit: Qindong Tang

What difficulties did you run into along the way? The biggest challenge was to get a comprehensive geographic sampling together. As pale sand martins breed in low densities only, this meant a lot of travelling. Fortunately, we could count on the great support of our collaboration partners and their network – Yang Liu from Sun Yat-sen University in Guangzhou, China, and Gombobaatar Sundev from the National University of Mongolia. Moreover, Qindong Tang made an incredible effort and did an excellent job during the fieldwork. 

What is the biggest or most surprising innovation highlighted in this study? Given the absence of obvious sexually selected traits and only gradual morphological differentiation between the different evolutionary lineages of the pale sand martin, the level of genetic differences and the fact that they behave like different species at least at the eastern edge of the Qinghai-Tibetan Plateau is indeed surprising. So, we were left with the following question: what processes and mechanisms prevent a complete mixing at secondary contact zones? We think that seasonal migration behavior might be an essential factor in maintaining genetic integrity of these morphologically cryptic evolutionary lineages.

Moving forward, what are the next steps in this area of research? The next step is evident: we need to study in detail migration behavior of the different lineages. The ranges of two of them meet in the area of a well-known avian migratory divide, where western lineages take a western migration route around the Qinghai-Tibetan Plateau to winter quarters in South Asia, and eastern lineages take an eastern route to Southeast Asia. This might also be the case in the pale sand martins and we hypothesize that hybrids might have nonoptimal intermediate migration routes and selection against them might restrict gene flow. This will need quite some field work and application of up-to-date technologies such as modern data loggers. Let’s hope that the development of the pandemic will allow field work again soon.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? It is best to get started and not be intimidated or even afraid. The easiest way to learn new methods is to start using them. It is also important to build a network of people who can be asked for support when problems arise.

What have you learned about methods and resource development over the course of this project? As always in studies with a phylogeographic background, sampling matters most. Try to organize a complete geographic sampling in the beginning of a project. Sampling in parts of the distribution area of our study system was planned in the third year of Qindong’s PhD; however, this could not be achieved due to the pandemic. As a consequence, we still lack samples from western Mongolia which would have been important and made the overall picture more comprehensive. This work did not include any development of new methods; however, a knowledge of state-of-the-art methodological approaches is obviously always crucial.

Describe the significance of this research for the general scientific community in one sentence. Our study points towards contrasting migration behavior as an important factor in maintaining evolutionary diversity under morphological stasis.

Describe the significance of this research for your scientific community in one sentence. Our discovery of cryptic diversity in the pale sand martin indicates that evolutionary diversity might be underestimated even in well-studied groups such as birds, and it suggests that it is worth having a closer look at widespread species occurring in different environments.

Photo of the first author Qindong Tang during the field work. Photo Credit: Qin Huang
Sampling team in Qinghai, PR China, June 2016. From left to right: Manuel Schweizer, Paul Walser Schwyzer, Yang Liu, Qin Huang, Yun Li and our driver. Photo Credit: Manuel Schweizer
Sampling team in Mongolia, June 2018. From left to right: Tuvshin Unenbat, Turmunbaatar Damba, Gombobaatar Sundev, Paul Walser Schwyzer, Manuel Schweizer, Silvia Zumbach, Sarangua Bayrgerel. Photo Credit: Manuel Schweizer

Tang Q, Burri R, Liu Y, Suh A, Sundev G, Heckel G, Schweizer M. 2022. Seasonal migration patterns and the maintenance of evolutionary diversity in a cryptic bird radiation. Molecular Ecology. https://doi.org/10.1111/mec.16241.

Interview with the authors: How can we use machine learning and genomics to make predictions about the effects of climate change?

In a recent paper in Molecular Ecology Resources, Fitzpatrick et al. used a combination of common garden experiments, genome sequencing, and machine learning analyses to understand how genomic offsets (a measure of maladaptation) can be used to predict how organisms might respond to future environmental change. They found that genetic offset was negatively associated with growth and was a better predictor of performance than the difference between sampling-site and common-garden environmental variables alone. See the full article for more details on how these trends aligned with panels of putatively informative and randomly selected SNPs, and the interview with lead author Matthew Fitzpatrick below for even more insight into this exciting work.

What led to your interest in this topic / what was the motivation for this study? My research focuses on spatial modeling of biodiversity and involves forecasting how climate change may impact natural systems. Demand for such forecasts continues to grow given the threats facing biodiversity. However, a major – and often overlooked – challenge is assessing forecasting models, which is really important given their potential to (mis)inform conservation. 

The motivation for this study was to test a type of genomics-based forecast founded on an idea that my coauthor Steve Keller and I developed a few years ago that we termed “genetic offsets”. Genetic offsets are in essence a forecast of climate maladaptation based on existing relationships between (adaptive) genomic variation and climate gradients. We tested how well genetic offsets correspond to biological responses to rapid climate change – in this case by transplanting trees from their home climate to a common garden experiment and measuring their response.
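As a rough illustration of the idea, here is a minimal sketch of one common way a genetic offset can be computed: each climate variable is rescaled through a monotone "turnover" curve describing how strongly genomic composition changes along that gradient, and the offset is the Euclidean distance between the home and new climates in that rescaled space. The curves, variable names, and values below are hypothetical and are not taken from the study; a fitted Gradient Forest model would supply the real curves.

```python
import math
from bisect import bisect_right


def turnover(value, xs, ys):
    """Linearly interpolate a monotone turnover curve given as knots (xs, ys).

    Values outside the range of xs are clamped to the end points.
    """
    if value <= xs[0]:
        return ys[0]
    if value >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, value) - 1
    frac = (value - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def genetic_offset(home, new, curves):
    """Euclidean distance between two climates after turnover rescaling."""
    total = 0.0
    for var, (xs, ys) in curves.items():
        diff = turnover(new[var], xs, ys) - turnover(home[var], xs, ys)
        total += diff * diff
    return math.sqrt(total)


# Hypothetical turnover curves: temperature matters more than precipitation.
curves = {
    "mean_temp_c": ([0.0, 10.0, 20.0], [0.0, 0.3, 0.4]),
    "precip_mm": ([200.0, 600.0, 1000.0], [0.0, 0.05, 0.1]),
}

home = {"mean_temp_c": 5.0, "precip_mm": 600.0}    # hypothetical source site
garden = {"mean_temp_c": 10.0, "precip_mm": 600.0}  # hypothetical common garden

offset = genetic_offset(home, garden, curves)
print(round(offset, 3))
```

Because the curves weight each variable by its apparent genomic importance, the same 5 °C shift can give a large or small offset depending on where along the gradient it falls, which is what distinguishes an offset from a raw climate difference.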

What difficulties did you run into along the way? There were all sorts of challenges one might expect with setting up and running common garden experiments in two countries, which as the modeler on the project I, thankfully, was largely isolated from. We were lucky to have Raju Soolanayakanahally on our team to help with common garden logistics in Canada, along with Steve’s lab running the Vermont common garden. Additionally, there was the challenge of how best to evaluate the population genomic data for signatures of local adaptation prior to the genetic offset modeling. This can always be a challenge to ensure you’re minimizing the effects of population structure and false positives. Steve and his former postdoc Vikram Chhatre approached this from several angles to make sure we had a robust set of selection outliers. From the modeling perspective, we had to be creative about fitting and summarizing a very large number of machine learning models.

What is the biggest or most surprising innovation highlighted in this study? We found pretty solid evidence that genetic offsets can serve as a meaningful estimate of the degree of expected maladaptation of populations exposed to climate change. It was nice to get some confirmation of our idea, but what was really surprising was that sets of randomly selected SNPs predicted performance of trees as well as or slightly better than did our set of carefully selected candidate SNPs, which was the opposite of what we expected. We’ve seen some other evidence in our simulation studies that also suggest SNPs from the genomic background can be predictive of maladaptation, although the reasons for this are still being investigated.

Moving forward, what are the next steps in this area of research? Ours is a single study on a single species of tree. Many more tests are needed in other study systems before we can fully understand the situations in which genetic offsets can serve a useful purpose. Also, our study tested genetic offsets derived from the machine learning method Gradient Forest, but Gradient Forest is just one of several statistical methods that can be used to estimate offsets. An important next step in my lab is to perform similar testing using another promising method known as generalized dissimilarity modeling.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? Take good notes and document the process! You will thank yourself later. If you are developing a new method, it is important to thoroughly test it to be sure you understand how it behaves in different circumstances and to make clear its intended uses before publishing on it. And last, teach others to use your method! 

What have you learned about methods and resource development over the course of this project? I thought I knew a lot about Gradient Forest and its behavior, but this study – and another we have in review on testing genetic offsets using simulated data – taught me that methods do not always behave the way we might expect or hope. And even when we have simulated “truth known” data, it can be difficult to understand why methods are behaving a certain way.

Describe the significance of this research for the general scientific community in one sentence. This study shows that for some organisms it may be possible to use genetic data to inform climate change impact assessments.

Describe the significance of this research for your scientific community in one sentence. This study provides evidence that spatial patterns of adaptive genomic variation along climatic gradients can be used to estimate the magnitude of expected maladaptation of populations exposed to rapid climate change through time.  

Fitzpatrick MC, Chhatre VE, Soolanayakanahally RY, Keller SR. 2021. Experimental support for genomic prediction of climate maladaptation using the machine learning approach Gradient Forests. Molecular Ecology Resources. https://doi.org/10.1111/1755-0998.13374.

Interview with the authors: does indoor spraying alter the genetic diversity of malaria-causing parasites and what does this mean for long-term control?

In a recent paper in Molecular Ecology, Argyropoulos and Ruybal-Pesántez et al. (2021) investigated the effects of indoor spraying on Plasmodium falciparum, the human malaria-causing protist. They find that 3 consecutive years of indoor spraying reduced transmission and prevalence of malaria by 90% and 35%, respectively, in the high malaria transmission site they surveyed. Despite these large reductions, a change in genetic diversity in P. falciparum that would indicate a large reduction in population size was not detected, illustrating the incredible resiliency of this parasite. Based on these data, the authors suggest that limiting malaria transmission in high transmission areas will require continued indoor spraying or other interventions such as mass drug administration. See the full article and interview with first authors Argyropoulos and Ruybal-Pesántez below for more details of this exciting work.

What led to your interest in this topic / what was the motivation for this study? Global efforts over the past 20 years have significantly reduced malaria mortality and morbidity around the world, but malaria transmission remains high in many countries in sub-Saharan Africa. A major challenge is the fact that most Plasmodium falciparum infections are asymptomatic, creating a persistent parasite reservoir that continually fuels transmission to mosquitos. Our group has a long-standing collaboration with colleagues at the Navrongo Health Research Centre and Noguchi Memorial Institute of Medical Research in Ghana, and the University of Chicago in the US, to conduct longitudinal field-based epidemiological studies of the P. falciparum reservoir in Bongo District, Ghana (Tiedje et al., 2017). Our motivation for this study was to understand P. falciparum transmission dynamics in the context of the roll-out of a malaria control intervention by combining population genetics with more traditional epidemiological and entomological parameters. Our previous research in Bongo District established there were high levels of P. falciparum genetic diversity with no population structure (Ruybal‐Pesántez et al., 2017). We were therefore interested in exploring whether the addition of a short-term indoor residual spraying (IRS) programme against a background of widespread long-lasting insecticidal nets (LLINs) would bottleneck this P. falciparum population in Bongo and lead to reductions in diversity and changes in population structure.

What difficulties did you run into along the way? One of the major technical limitations in P. falciparum genotyping is phasing multi-genome infections to assign multilocus haplotypes. Eighty per cent of the population of all ages where we work in Ghana have multiple diverse parasite genomes. This is also a problem for whole genome sequencing of isolates. To get around this problem, we focus on genotyping monoclonal infections using panels of multi-allelic microsatellite markers or biallelic SNPs. In high-transmission settings like our study site in Ghana, microsatellite genotyping of P. falciparum provides increased power of inference and higher resolution than biallelic SNPs (Anderson et al., 2000; Ellegren, 2004; Selkoe and Toonen, 2006).
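To make the monoclonal-infection workaround concrete, here is a minimal sketch, assuming a simple dictionary layout for genotype calls. It keeps only isolates that show a single allele at every microsatellite locus and then summarises diversity per locus with expected heterozygosity, a standard statistic that the interview itself does not name. The isolate names, locus names, and allele sizes are hypothetical.

```python
from collections import Counter


def is_monoclonal(genotype):
    """Return True if every locus shows exactly one allele."""
    return all(len(alleles) == 1 for alleles in genotype.values())


def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity: n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(alleles)
    if n < 2:
        return 0.0
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical allele calls (fragment sizes in bp) per isolate and locus.
isolates = {
    "iso1": {"locusA": [120], "locusB": [88]},
    "iso2": {"locusA": [124], "locusB": [88]},
    "iso3": {"locusA": [120, 128], "locusB": [90]},  # multi-genome infection
    "iso4": {"locusA": [128], "locusB": [92]},
}

# Multi-genome infections are dropped because their haplotypes cannot be phased.
mono = {name: g for name, g in isolates.items() if is_monoclonal(g)}

he_A = expected_heterozygosity([g["locusA"][0] for g in mono.values()])
he_B = expected_heterozygosity([g["locusB"][0] for g in mono.values()])
print(sorted(mono), round(he_A, 3), round(he_B, 3))
```

Comparing such per-locus values between surveys taken before and after an intervention is one simple way to look for the loss of diversity that a population bottleneck would be expected to produce.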

What is the biggest or most surprising innovation highlighted in this study? In our paper, we find that the addition of three rounds of IRS against a background of LLINs between 2013 and 2015 did not lead to a population bottleneck or dramatic change in parasite genetic diversity. This was striking because IRS did achieve a >90% reduction in local malaria transmission intensity and 37.5% fewer malaria infections in the community. A rebound of P. falciparum transmission is therefore highly likely if these control programmes are not implemented long-term.

Moving forward, what are the next steps in this area of research? Population genomic approaches are increasingly being applied to enhance our understanding of epidemiology, transmission dynamics, and public health strategies for a variety of pathogens. In the malaria field, the potential of genomic data to guide control and elimination strategies has been recognized but is still in early stages with respect to its translation into general practice. In our paper, we highlight that genomic surveillance is pivotal to assess progress towards achieving the World Health Organisation Global Technical Strategy for Malaria 2016-2030 targets. Along with our collaborators in Ghana, we have conducted follow-up surveys in our study site to track the long-term implications of this IRS intervention, as well as other interventions that have been rolled out across Bongo District since 2015. We are also applying phylodynamic approaches to characterize variant antigen genes to further explore the impact of interventions on P. falciparum adaptation and fitness, as alternate but complementary surveillance metrics in this high-transmission setting. 

Dionne Argyropoulos, co-first author on this paper, is investigating the neutral and adaptive genetic diversity of P. falciparum in these follow-up surveys and in the context of other control interventions as part of her PhD research. Shazia Ruybal-Pesántez, co-first author on this paper, is currently applying a suite of genomic epidemiology approaches to better understand residual and resurgent malaria transmission dynamics in the Asia-Pacific and Americas regions as part of her post-doctoral research.

What have you learned about methods and resources development over the course of this project? Firstly, it is important that you understand the basic principles of the concepts that you are using. It may seem rudimentary, but these principles will ensure that you are answering the scientific question that you are interested in and are maintaining scientific integrity throughout the research process. Asking for help or support from others in your field is also useful to bounce ideas around and enhance your understanding of your research findings. The most exciting part of Molecular Ecology is how we utilise the insights of molecular techniques to answer big picture questions. Our study integrated population genetics and genomic surveillance to address key research questions about malaria transmission and control interventions. To do this, we used existing molecular techniques (i.e., microsatellites) in new ways (i.e., to evaluate IRS over time). We also believe that it is important to not be afraid to apply novel techniques to new research questions, such as using bioinformatic tools and various packages in R.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? This project was unique as it involved field sample collection and processing, parasite genotyping, and data generation, and the analysis required combining traditional epidemiological methods with population genetics and genomics approaches. When working with large sample sets and datasets, it is critical to pay attention to detail during data generation, curation and downstream analyses. Developing and strengthening coding skills was instrumental in enabling us to execute the necessary analyses of these data. We found R to be an incredibly useful resource to document our analyses and facilitate discussion and interpretation of the data with colleagues, while ensuring reproducibility of our work. We used several well-established R packages for data management and the population genetics analyses. Overall, this multidisciplinary project would not have been possible without being part of a multi-disciplinary team with a wealth of knowledge and the strong collaborations with experienced researchers in Ghana.

Describe the significance of this research for the general scientific community in one sentence. We show how parasite genetics can be harnessed to better understand the efficacy of malaria control interventions, particularly by identifying key factors leading to parasite resilience that may not be reflected in other commonly used evaluation metrics. 

Describe the significance of this research for your scientific community in one sentence. Short-term indoor residual spraying with insecticides did not cause a dramatic change in the genetic diversity of P. falciparum in Bongo District, Ghana, therefore long-term strategies are necessary to genetically bottleneck the parasite population.

Argyropoulos DC*, Ruybal-Pesántez S*, Deed SL, Oduro AR, Dadzie SK, Apparu MA, Asoala V, Pascual M, Koram KA, Day KP, Tiedje KE. 2021. The impact of indoor residual spraying on Plasmodium falciparum microsatellite variation in an area of high seasonal malaria transmission in Ghana, West Africa. Molecular Ecology. https://doi.org/10.1111/mec.16029. (*joint lead authors)

Joint lead authors Dionne Argyropoulos (left) and Shazia Ruybal-Pesántez (right). Photo Credits: The Stockholm International Youth Science Seminar, Unga Forskare; http://www.ungaforskare.se (left) and The Walter and Eliza Hall Institute of Medical Research; www.wehi.edu.au (right).

Interview with the authors: genomic and phenotypic divergence between populations in translocated species

In a recent issue of Molecular Ecology, Taylor et al. explore how between-population translocations of a small and endangered freshwater fish may break the long-term evolutionary boundaries between populations in this species. In this study, the researchers used a combination of genomic and phenotypic data to show that translocation efforts, which were necessary for meeting species conservation goals, could alter some important genetic and morphological differences between populations. To read the complete story, see the full article now available online as well as the interview with the authors below.

What led to your interest in this topic / what was the motivation for this study? Some excellent work with microsatellites had previously identified three populations of Bluemask Darters across their small range (Robinson et al. 2013, Cons. Gen.). One population, larger and more genetically diverse than the others, was in the Collins River, in the western portion of the range. A second population was in Rocky River, more central. A third population was in Cane Creek and the Caney Fork to the east. There was also a population in the Calfkiller River, which has been extirpated for several decades. In this context, captive-reared Bluemask Darter progeny from the Collins River population were being introduced to the Calfkiller River. But the location of the Calfkiller, near the center of the range, gave an important quirk to the system. If the three populations were not equally distinct, then Calfkiller River might be better suited with individuals from Rocky River, Cane Creek, or Caney Fork, rather than the western Collins River. In other words, the geography of the system meant that we needed to know the phylogenetic or hierarchical structure of the populations to know what boundaries might be lurking between Collins River and an introduced population in the Calfkiller River.

What difficulties did you run into along the way? One challenge in our project was navigating the connection between our scientific discoveries and the underlying goals of conservation. Our analyses were focused on the quantitative aspects of Bluemask Darters phylogenetics. However, at the end of the day, we are talking about an endangered species, incredibly imperiled, with a tiny range and an uncertain future. No quantitative value can give us strict guidance about the normative problems of conservation. So a challenge was to unpack, as best as we could, how our conclusions about the phylogenetics, population structure, and demography of this species could ultimately help us conserve the multiple diverging lineages of Bluemask Darters. The reviewers and editors from Molecular Ecology helped us refine our logic and our language, and the final result is a paper that acknowledges the complexities and competing concerns of translocation in a system like this.

What is the biggest or most surprising innovation highlighted in this study? One of the most significant findings of this study was the discovery of two divergent clades of Bluemask Darters — precisely the boundary being broken by current conservation management decisions that move fish between clades! One clade includes western individuals and the other includes eastern individuals. To top it off, we had the unique opportunity to use historic morphological data from across the range, including the Calfkiller River site where the fish had been extirpated, and which was now being restored with fish originating from the western population. The consistent result was that eastern sites harbor a distinct population from western sites, and that the Calfkiller River was associated with the eastern population. It is now apparent that translocated individuals should be from a source consistent with the clade that previously occupied the Calfkiller River, and from a source that will not artificially perturb existing evolutionary boundaries. In our study, there are additional complicating factors — the ideal eastern translocation sources are low abundance and not as genetically diverse. So our study was also a new opportunity to address how we might balance multiple concerns, with genetic details, while addressing a complicated conservation issue.

Moving forward, what are the next steps in this area of research? In our paper, we discuss how there are juvenile Bluemask Darters that drift into the reservoir at the center of the range and may not be able to migrate upstream to appropriate habitats needed as adults. These young fish are from the Rocky River, and are part of the appropriate clade for restocking the Calfkiller River. However, the success of this strategy would depend on the population dynamics of young fish in the reservoir. Jeff Simmons, co-author on this paper, and colleagues will be pushing forward with the critical next steps. There will be studies of the density and abundance of juvenile fish in the reservoir, including whether juveniles recruit into a breeding population or simply perish before maturity. There is also ongoing monitoring of the translocated fish in the Calfkiller, and across the species range. All of this work is being combined with habitat quality monitoring aimed at unraveling the location, frequency, and cause(s) of water quality issues that are harming darters in this system. All together, we’re continuing to build a picture of how best to conserve the distinct lineages of Bluemask Darters. 

What have you learned about methods and resources development over the course of this project? Making this project successful meant combining dozens of different analyses (assembling, aligning, and filtering sequences, phylogenetics, population structure, genetic differentiation statistics, demographic simulations, to name a few), each of which has its own traps and idiosyncrasies. Getting these methods working required, first, well, getting everything to run, and then getting everything to run correctly. As useful as online documentation is, I learned there is no substitute for learning with colleagues who are engaging in similar research. Shout out especially to Dan MacGuigan, Daemin Kim, and Ava Ghezelayagh, all students with Tom Near. My conversations with these and other colleagues were critical for avoiding analytical pitfalls. These conversations also spurred ideas about new analyses and perspectives that will continue moving phylogenetic and population genetic work forward.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? It’s been said before, but it really was important to have reproducible code for this project. Working with next-generation sequence data meant an enormous number of different files and analysis packages. Being able to switch between versions (like with git), automate programs (like with bash scripts), and manage software environments (like with conda) saved us hundreds of hours. At the end, you can neatly package everything up; all of our data and code, for example, are now stored in a Dryad repository that could basically reproduce our paper from scratch in just a few commands. Even after publication, sharing code has also meant starting new conversations with other scientists about best practices, alternate methods, and new ideas for genetic analyses.

Describe the significance of this research for the general scientific community in one sentence.

Our study uses genetic and morphological data to unravel how translocation strategies for an endangered freshwater fish might balance the competing conservation concerns of phylogenetic divergence, genetic diversity, and population demography.

Describe the significance of this research for your scientific community in one sentence.

Our study identifies two distinct clades of endangered Bluemask Darters across their small range, where current management decisions are translocating individuals across those diverging lineages.

Taylor LU, Benavides E, Simmons JW, Near TJ. 2021. Genomic and phenotypic divergence informs translocation strategies for an endangered freshwater fish. Molecular Ecology. https://onlinelibrary.wiley.com/doi/10.1111/mec.15947.

Interview with the authors: Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches

Written by Beatriz Mello and Sudhir Kumar

The work presents the most extensive evaluation to date of how well relaxed-clock methods infer divergence times for datasets that contain a mixture of population and species divergences. Such datasets are commonly used in phylogeography, phylodynamics, and species delimitation studies. A wide range of biological scenarios was explored, which allowed us to compare and contrast the accuracies and precisions of divergence times for a Bayesian (BEAST) and a non-Bayesian (RelTime in MEGA) method. Results showed that both RelTime and BEAST generally perform well and that RelTime presents a reliable and computationally efficient alternative to speed up molecular dating.


Lead author Beatriz Mello.

What led to your interest in this topic / what was the motivation for this study?

Our interest in this topic was driven by a major dilemma faced by researchers when analyzing data containing molecular sequences from closely related individuals and individuals from distinct species. The dilemma arises because the Bayesian framework requires a tree prior to infer divergence times. A myriad of tree priors is available but, most importantly, each models either divergence between species or divergence within species. Thus, any single tree prior will be suboptimal for describing the evolutionary process in datasets with mixed sampling. So, our question was: would a single tree prior, although misspecified, still produce good time estimates? Also, no one had previously examined how well non-Bayesian methods, which do not require specification of priors, perform for such datasets.

What difficulties did you run into along the way? 

One of the major difficulties we faced was the computational burden of Bayesian analysis. We all know that molecular dating using Bayesian methods can be time-consuming. However, it becomes onerous in computer simulation studies because many datasets need to be analyzed. Each Bayesian analysis took several hours to complete, and we had to conduct thousands of Bayesian analyses. This was not an issue with the RelTime method, which finished computing in minutes.

What is the biggest or most surprising innovation highlighted in this study? 

Our biggest finding is that, although the tree prior will frequently be an erroneous description of biological evolution, the accuracy of time estimates is not greatly impacted for most choices of the tree prior. This is good news for researchers working with phylogenies containing a mix of populations and species. On top of that, RelTime is much faster than the Bayesian approach and produces similar results. This finding is important since the amount of sequence data is growing rapidly. A fast and accurate method allows hypothesis testing to be done using different assumptions and data subsets, improving scientific rigor and reproducibility by others.

Moving forward, what are the next steps in this area of research?

For Bayesian methods, it will be useful to develop faster approaches. However, the excellent performance of the RelTime approach that does not require prior specification is very encouraging. Evolutionary simulations employing even more diverse biological conditions and tree topologies, especially involving many sequences, will be a very useful next step, which may only be feasible with RelTime and other fast methods.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

Our main message for students is to realize that no method is almighty. For those aspiring to develop new methods, our first step is to apply different methods to a diversity of datasets and examine how the results differ, why they differ, and whether we can solve the problem discovered. It is equally important for those applying new methods to use different methods and scrutinize differences in results. It is not a good idea to assume that a popular protocol is better than others by default; we need to keep an open mind and make decisions with evidence.

What have you learned about methods and resources development over the course of this project?

All of us learned quite a lot about the multispecies coalescent approach by analyzing simulated data because we know the correct result. The lesson was that some methods require many assumptions and that sometimes even small changes can have a big impact, resulting in distinct evolutionary inferences. So, we need to be very careful and explore a wide range of biological assumptions. Also, there is a strong need for more realistic simulation studies.

Describe the significance of this research for the general scientific community in one sentence.

Researchers will now be able to decide which methods and approaches to apply to their particular dataset using results from this study.

Describe the significance of this research for your scientific community in one sentence.

The accuracy and precision of divergence time estimation for datasets that contain both intra- and interspecies molecular sequences is tested for slow (Bayesian) and fast (RelTime) molecular dating approaches.

References

Mello B, Tao Q, Barba-Montoya J, Kumar S. Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches. Mol Ecol Resour. 2021;21:122–136. https://doi.org/10.1111/1755-0998.13249