Interview with the authors: Whole-genome analysis of multiple wood ant population pairs supports similar speciation histories, but different degrees of gene flow, across their European ranges

In a recent paper in Molecular Ecology, Portinha et al. used population genomic data to analyse the speciation history of two closely related species of wood ants, Formica polyctena and F. aquilonia. Using a demographic modelling approach, the authors reconstructed the history of divergence for multiple heterospecific pairs of populations. In all cases, the authors found evidence for divergence with gene flow. However, for a sympatric population pair sampled in Finland there was evidence for substantially elevated gene flow between the species. Their findings imply that inferences about speciation history from population genomic data may vary with where populations are sampled.

We sent some questions to Beatriz Portinha and Pierre Nouhaud, the corresponding authors of this work, to get more detail on this study.


Ant mound surface covered in ants. Photo credit: Jack Beresford

What led to your interest in this topic / what was the motivation for this study? 

Knowledge of demographic and speciation histories is essential for understanding
contemporary genomic patterns in natural populations, which is why we wanted to
reconstruct them for the emerging Formica model system. Our study species, Formica polyctena
and F. aquilonia, are known to hybridize naturally in Southern Finland, where their hybrids
have been studied for over 10 years (Kulmuni et al., 2010; Martin-Roy et al. 2021). We
wanted to test whether a similar divergence history was consistently inferred across the
European ranges of both species, or whether the Finnish populations would stand apart,
possibly because of gene flow mediated by hybrid populations in the area.

What difficulties did you run into along the way? 

Formica polyctena and F. aquilonia had a limited genomic toolbox when we started the
project, and we initially relied on a distant and non-contiguous reference assembly.
Meanwhile, our group assembled a high quality reference genome (Nouhaud et al., 2022),
which improved the quality of our inferences.


The demographic modelling software we used, fastsimcoal2, can simulate a large panel of
evolutionary scenarios. When planning this study, we wanted to design models that
considered alternative scenarios for the divergence of the species which would be as
biologically meaningful as possible, while keeping the number of models low enough that the
project 1) would not be a huge computational burden and 2) would be executable in the
available time frame (Beatriz’s MSc project, funded by Erasmus+ and Societas pro Fauna et
Flora Fennica). This was an especially important aspect as we used four distinct population
pairs to reconstruct the history of the two species, so each model had to be run at least four
times.

What is the biggest or most surprising innovation highlighted in this study? 

We found that there was already bidirectional gene flow occurring in Finland before the
hybridization events that led to the present-day hybrid populations. This was not suspected
before, as there is no evidence in the literature, and it suggests that F. polyctena in Finland
may be admixed, which is supported by the fact that we have not found non-admixed F.
polyctena individuals in Finland.

Moving forward, what are the next steps in this area of research?

The divergence history we inferred between F. polyctena and F. aquilonia can be used to
run simulations about the evolution of the hybrid populations, which is what we did in a
subsequent work (Nouhaud et al. 2022). In the longer run, it would also be important to
extend this work by reconstructing the divergence history of the whole F. rufa species group,
which encompasses 5 species (including F. aquilonia and F. polyctena) and where gene flow
is prevalent (Seifert, 2021).

Describe the significance of this research for the general scientific community in one sentence.

Genomes from individuals sampled thousands of kilometers apart tell the same ancient
history, while their most recent history may be different.

Describe the significance of this research for your scientific community in one sentence.

The divergence history between two species can be reliably and consistently inferred from a
small number of individuals sampled across the species’ ranges.


Portinha, B., Avril, A., Bernasconi, C., Helanterä, H., Monaghan, J., Seifert, B., Sousa, V. C., Kulmuni, J., & Nouhaud, P. (2022). Whole-genome analysis of multiple wood ant population pairs supports similar speciation histories, but different degrees of gene flow, across their European ranges. Molecular Ecology, 31, 3416–3431.


Kulmuni, J., Seifert, B. & Pamilo, P. (2010). Segregation distortion causes large-scale
differences between male and female genomes in hybrid ants. Proceedings of the National
Academy of Sciences, 107(16), 7371-7376.


Martin-Roy, R., Nygård, E., Nouhaud, P. & Kulmuni, J. (2021). Differences in thermal
tolerance between parental species could fuel thermal adaptation in hybrid wood ants.
American Naturalist, 198(2), 278-294.


Nouhaud, P., Beresford, J. & Kulmuni, J. (2022). Assembly of a hybrid Formica aquilonia × F.
polyctena ant genome from a haploid male. Journal of Heredity, esac019, 1-7.


Nouhaud, P., Martin, S. H., Portinha, B., Sousa, V. C. & Kulmuni, J. (2022). Rapid and
repeatable genome evolution across three hybrid ant populations. bioRxiv.


Seifert, B. (2021). A taxonomic revision of the Palaearctic members of the Formica rufa
group (Hymenoptera: Formicidae) – the famous mound-building red wood ants.
Myrmecological News, 31, 133-179.

Interview with the authors: Associations between MHC class II variation and phenotypic traits in a free-living sheep population

In a recent paper in Molecular Ecology, Huang, Dicks and colleagues analysed variation in the major histocompatibility complex (MHC) and phenotypic traits in an unmanaged population of sheep living on an island off the coast of Scotland. This population of sheep has been studied closely for around 70 years, providing a very rare level of insight and statistical power to evolutionary genetic studies. The MHC is among the most variable parts of mammalian genomes and has long been known to encode proteins central to the adaptive immune system. Through their analyses, Huang, Dicks and colleagues found associations between levels of circulating antibodies and variation at MHC loci.

We sent some questions to the corresponding author of this work, Wei Huang, to get more detail on this new study.


Rams in St Kilda. Photo credit: Martin Adam Stoffel.

Can you describe the significance of this research for the general scientific community in one sentence?

This study demonstrated the direct link between immune genes and antibody levels in wild populations.

What led to your interest in this topic / what was the motivation for this study? 

The major histocompatibility complex (MHC) contains a number of genes linked with immune defence in vertebrates. Associations between MHC variation and phenotypic traits or pathogens have been identified in many species. Selection on MHC genes has also been demonstrated in some studies. However, many previous studies only examined associations between MHC variation and a limited number of phenotypic traits or pathogens. Few of them have examined both MHC-fitness associations and MHC-trait associations. The longitudinal study of Soay sheep in St Kilda is a great system to study the associations between MHC variation and phenotypic traits and how the associations are linked with selection on MHC genes. Using three representative phenotypic traits monitored in thousands of sheep over decades, we are able to provide a full picture of MHC-trait associations in wild populations.


Can you describe the significance of this research for your scientific community in one sentence?

This study suggests associations between MHC and phenotypic traits are more likely to be found for traits more closely associated with pathogen defence than integrative traits and highlights the association between MHC variation and antibodies in wild populations.

What difficulties did you run into along the way? 

It is extremely hard to monitor populations and collect longitudinal data over decades. Thanks to our great field assistants and volunteers, the Soay sheep data has provided a good foundation. In terms of the specific study, the first difficulty is to genotype MHC in a large number of sheep. We used two steps to genotype the MHC genes. We first used genotype-by-sequencing to genotype hundreds of sheep. Then, benefiting from the high-density sheep SNP chip, we were able to use 13 SNPs to genotype MHC in the other thousands of sheep successfully.

Additionally, it is hard to choose the appropriate model. Some of our traits are not normally distributed and are also not close to other common error structures. We instead used Bayesian statistical methods to run the analysis.

What is the biggest or most surprising innovation highlighted in this study? 

We used three representative traits to examine the associations between MHC variation and phenotypic traits. The traits included a fitness-related integrative trait, body weight; a measure of gastrointestinal parasites, faecal egg count; and the levels of three antibodies. All three traits are related to fitness. We only found associations between MHC variation and antibodies. Such results reflect the important role of MHC in immune defence in wild populations. Our study is one of the first studies to examine associations between MHC variation and multiple phenotypic traits.

How do you think your results generalize to other systems?

Our study is based on the longitudinal study of Soay sheep. The large sample size provides great statistical power. Therefore, our results are reliable and solid. Also, we investigated phenotypic traits that have different links with immune defence. Therefore, our results can reflect the general pattern of MHC-trait associations.

You conclude from your study that MHC variation is more likely to be associated with immune traits. How would you validate your findings for species with less rich data?

First, it is possible to use experiments to test the associations. In terms of wild populations, future studies can investigate multiple populations or multiple traits in a single population if they are restricted by the study length.

Moving forward, what are the next steps in this area of research?

Our study demonstrates that it is important to study MHC-antibody associations. Future studies should focus on immune traits rather than only examining MHC-pathogen associations. Also, previous studies are often constrained by small sample sizes. It would be nice if future studies could increase their sample size to strengthen their statistical power.


Huang, W.*, Dicks, K. L.*, Ballingall, K. T., Johnston, S. E., Sparks, A. M., Watt, K., Pilkington, J. G., & Pemberton, J. M. (2022). Associations between MHC class II variation and phenotypic traits in a free-living sheep population. Molecular Ecology, 31, 902–915.

*These authors contributed equally to this work

Interview with the authors: A holobiont view of island biogeography: Unravelling patterns driving the nascent diversification of a Hawaiian spider and its microbial associates

In their recent paper in Molecular Ecology, Armstrong, Perez-Lamarque et al. investigated the evolution of the holobiont. The holobiont is the assemblage of species associated with a particular host organism. In the case of this study, the holobiont refers to the stick spider (Ariamnes), its microbiome and its endosymbionts. Taking advantage of the successive colonization of islands in a volcanic archipelago, Armstrong, Perez-Lamarque et al. contrasted the evolutionary history of the host species with that of the other components of the holobiont on different islands in Hawai’i.

We sent some questions to the authors of this work and here’s what Benoît Perez-Lamarque, Rosemary Gillespie and Henrik Krehenwinkel had to say.

Ariamnes waikula (on the island of Hawaii). Photo credit: George Roderick

What led to your interest in this topic / what was the motivation for this study? 

Gut microbiota play multiple roles in the functioning of animal organisms. In addition, host-associated microbiota composition can be relatively conserved over time and the concept of the “holobiont” has been proposed to describe the ecological unit formed by the host and its associated microbial communities. Yet, it remains unclear how the different components of the holobiont (the hosts and the microbial communities) evolve. This is what spurred our interest. Taking advantage of the chronologically arranged series of volcanic mountains of the Hawaiian archipelago, we were able to tackle this question and could investigate how the different components of the holobiont have changed as the host spiders colonized new locations.

Can you describe the significance of this research for the general scientific community in one sentence?

The evolution of Hawaiian spider hosts and that of their associated microbes are affected differently by the dynamic environment of the volcanic archipelago.

Can you describe the significance of this research for your scientific community in one sentence?

The host and its associated microbiota may not act as a single and homogeneous unit of selection over evolutionary timescales.

Ariamnes waikula (on the island of Hawaii). Photo credit: George Roderick

What difficulties did you run into along the way? 

Not all components of the holobiont are equally easy to study. For instance, for the host spiders, we used double digest RAD sequencing (ddRAD) to obtain genome-wide single nucleotide polymorphism data. With such data, we could precisely reconstruct the evolutionary histories of the different spider populations in the last couple of million years and track the finest changes in their genetic diversity. In contrast, characterizing the composition of the microbial components is much more challenging. We used metabarcoding of a short region of the 16S rRNA gene to identify the bacteria present. However, over such short evolutionary timescales, this DNA region is too conserved to accumulate many differences between isolated populations. Therefore, we had high-resolution data for the spider hosts but comparatively low-resolution data for the bacterial communities. To ensure that the observed patterns were not artefacts of this difference in resolution, we complemented our analyses with a range of simulations to assess the robustness of our findings.

What is the biggest or most surprising innovation highlighted in this study? 

We find that the different components of the holobiont (the host spiders, the intracellular endosymbionts, and gut microbial communities) respond in distinct ways to the dynamic environment of the Hawaiian archipelago. While the host spiders have experienced sequential colonizations from older to younger volcanoes, resulting in a strong (phylo)genetic structuring across the archipelago’s chronosequence, the gut microbiota was largely conserved in all populations irrespective of the archipelago’s chronosequence. In between these two patterns, we found different endosymbiont genera colonizing the spiders on each island. This suggests that this holobiont does not necessarily evolve as a single unit over long timescales.

In the conclusion to your study, you point out how different components of the holobiont likely contribute differently to selection/colonization history in this system. If you had unlimited resources, what would you do to strengthen this conclusion? 

We indeed suspect that the different components of the holobiont probably did not act as a single and homogeneous unit of selection during the colonization of the Hawaiian archipelago. First, it would be ideal to perform an even broader sampling, targeting more Ariamnes populations and species from older islands, to better characterize the long-term changes of the different holobiont components. Using sequencing techniques with better resolution (as detailed below) would also improve our characterization of the microbial component(s) of the holobiont. Second, to properly test for selection, we should perform transplant experiments of the bacterial communities between spider populations/species and measure whether or not it impacts holobiont fitness. We would expect to find a significant impact of the transplant for the endosymbionts, but no or low impact for the gut bacterial communities of these spiders.

The geological history of Hawai’i provides a powerful system to build understanding of the evolution of the holobiont. Are you aware of other systems where similar studies could be performed? (I appreciate that this is related to the previous question!)

Many other archipelagos, with similar island chronosequences, like the Canary Islands or the Society Islands, are also ideal for testing hypotheses on the evolution of holobionts. Within the Hawaiian archipelago again, we could replicate our work on other holobiont systems. For instance, among arthropods, plant feeders might rely more heavily on their microbiota for their nutrition, and this might translate into different patterns of holobiont evolution.

Moving forward, what are the next steps in this area of research?

As previously said, one main limitation is the low resolution of the 16S rRNA metabarcoding. This prevented us from looking at the evolutionary history of the individual bacterial lineages. Using a new model, we have recently tackled this issue of low resolution (https://doi.org/10.1128/msystems.01104-21) and we reported little evidence of microbial vertical transmission in these holobionts. Yet, the next step would be to move from classical metabarcoding to metabarcoding with longer sequencing reads (e.g. the whole 16S rRNA gene) or even metagenomics. It would provide more resolution for looking at bacterial evolution and would also bring more information on the functioning of these bacterial communities (e.g. are gut microbiota contributing to the digestion of these Hawaiian spiders in natural environments?).


Armstrong, E. E.*, Perez-Lamarque, B.*, Bi, K., Chen, C., Becking, L. E., Lim, J. Y., Linderoth, T., Krehenwinkel, H., & Gillespie, R. G. (2022). A holobiont view of island biogeography: Unravelling patterns driving the nascent diversification of a Hawaiian spider and its microbial associates. Molecular Ecology, 31, 1299–1316.

*Authors contributed equally

Interview with the authors: Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers

In a recent paper in Molecular Ecology, Caballero-López and colleagues investigated the genetics of migratory behaviour in two subspecies of willow warbler (Phylloscopus trochilus trochilus and Phylloscopus trochilus acredula). Previous work had identified several genetic markers associated with migratory behaviour in this species, but a particularly important candidate marker could not be mapped to previous genome assemblies. This suggested to Caballero-López et al. that the marker may lie in a highly repetitive, and thus difficult to assemble, genomic region. Leveraging a recent genome assembly based on long-read technology and a quantitative PCR approach, Caballero-López et al. found that the elusive migration marker is located in a genomic region rich in remnants of transposable elements.

We sent some questions to the primary author of this work, Violeta Caballero-López, to get some more insight and details about this exciting study.

Willow warbler male, 2017. Photo credit: Harald Ris.

What led to your interest in this topic / what was the motivation for this study? 

My research aims to shed some light on our understanding of the genetics underpinning bird migration, which is currently very poor. Passerine birds migrate alone, and they follow the same routes to wintering grounds as their parents, fully relying on genetic mechanisms.

The motivation for this specific study was to try to characterize a region in the genome which varies between two subspecies of willow warbler that present differential migration to Africa. Until now, this region was only identified as an AFLP-derived marker which failed to be mapped to the genome. However, with the use of molecular techniques such as qPCR in combination with a good quality genome assembly, we could understand the nature of this element better.

AFLP: Amplified Fragment Length Polymorphism

qPCR: Quantitative Polymerase Chain Reaction

Can you describe the significance of this research for the general scientific community in one sentence?

Repeat-rich regions, which are often considered “junk DNA”, might have a larger role in phenotypes and function than previously thought.


Can you describe the significance of this research for your scientific community in one sentence?

It is important to revise the role of repeat DNA in determining a complex trait such as bird migratory routes.

Willow warbler singing in Siberia, 2017. Photo credit: Harald Ris.

What difficulties did you run into along the way? 

For more than 20 years “WW2” has been an elusive AFLP marker, observed to be fixed in the “northern” subspecies P. t. acredula. It could only be amplified in PCR as a 154 bp fragment and then sequenced, but its nature was totally unknown. The identification and curation of this sequence as a transposon (TE) was challenging because it is an old, degraded “LTR portion” of the full element. This required a willow warbler genome built with long read sequencing techniques that provided regions of the genome rich in repeat DNA. Locating the ends of this transposon was also complicated. Alignment “breaks” serve as a detection method for the target site duplications that mark the edge of these elements. However, they could not be used in our system because these TEs appear consistently embedded within a larger block of repeats. This interfered with our estimation of age and theories about the origin of the repeat.

LTR: Long terminal repeat

What is the biggest or most surprising innovation highlighted in this study? 

The most surprising finding here is the presence of a large repeat-rich region (>12 Mb) that segregates in both willow warbler subspecies. This region is characterized by several copies of the WW2 derived variant, which turned out to be part of a transposable element belonging to the endogenous retrovirus family. Furthermore, we provide solid evidence of its independence from the other polymorphic regions in chromosomes 1 and 5. As this TE seems to be inactive, and no clear functional genes have been detected in its surroundings, it remains puzzling why this region correlates with migration in the willow warbler so strongly.

You end your paper describing how it’s premature to think that the association of the WW2 derived variant has a causal role in the trait. Based on your knowledge of the warbler genome, would you care to speculate as to the actual causal basis of the phenotype?

The most supported hypothesis is that migration is a complex trait influenced by gene packages. In the case of the willow warblers, I would speculate that the repeat-rich region, and not necessarily the WW2 derived variant itself, could affect migration indirectly through 1) the formation of a structural variant in a chromosome that affects gene expression; 2) the trans-regulation from this region of some gene(s) elsewhere in the genome; 3) the presence of an adjacent gene outside this region that we have not been able to detect in the current genome assemblies so far; or 4) a missed single-copy gene within the repeat-rich region. However, the last one is the least likely given that areas with such a repeat density rarely contain functional genes.

Have you got any ideas of how you might test the hypothesis that chromosomal rearrangements were facilitated by the presence of TEs?

The most exciting possibility is to visually confirm whether these rearrangements have taken place. A way to test this empirically would be to obtain a karyotype of each subspecies and combine it with fluorescence in situ hybridization (FISH). First, a probe labelling the WW2 derived variants would signal the location of the repeat-rich region. Once the location of this region is resolved, it is possible to design several fluorescent probes outside of it to determine if the chromosomal arrangement around it is maintained both in the genome of P. t. acredula and in its orthologous region in P. t. trochilus.

Moving forward, what are the next steps in this area of research?

The biggest mystery within this study is the location in the genome of this repeat-rich region that contains several copies of the WW2 derived variant. One of the biggest challenges of genome assemblies is the mapping and correct location of repeat-dense sequences, and therefore future effort should be focused on obtaining empirical evidence of the location of this region. Then we could get a better idea of whether and how this region affects migration. Is it downstream or upstream of any gene complex? Is it silenced? How does its orthologue look in P. t. trochilus?

Typical working setup for the willow warbler team, 2021. Photo credit: Harald Ris.

Caballero-López, V., Lundberg, M., Sokolovskis, K., & Bensch, S. (2022). Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers (Phylloscopus trochilus). Molecular Ecology, 31, 1128–1141.

Interview with the authors: How genomic data reveal cryptic species and how migration patterns maintain genetic divergence in birds?

In a recent paper in Molecular Ecology, Tang et al. investigated genetic divergence of different subspecies of pale sand martin (Riparia diluta) using genome-wide data. They found that the subspecies in Central and East Asia, which vary only gradually in morphology, broadly represent three genetically differentiated lineages. No signs of gene flow were detected between two lineages that met at the eastern edge of the Qinghai-Tibetan Plateau, which is likely due to largely different breeding and migration timing. Limited mixed ancestries were found in Mongolian populations between two lineages that might take divided migration routes around the Qinghai-Tibetan Plateau, and the authors hypothesize that selection against hybrids with nonoptimal migration routes might restrict gene flow. See the full article for more details of the study and the interview with lead author Manuel Schweizer below for more stories behind this exciting work.

Pale sand martin Riparia diluta tibetana, Mongolia, June 2018. Photo Credit: Manuel Schweizer

What led to your interest in this topic / what was the motivation for this study?

I studied pale sand martin in Central Asia as part of the work on a field guide to the birds of Central Asia, which was published in 2012. I was then fascinated by the fact that the different subspecies described for this species breed in completely different environments: Central Asian steppes and semi deserts, high altitude grasslands on the Qinghai-Tibetan Plateau, or lowland subtropical China. Although it was evident that morphological identification of single individuals of the different subspecies without context is not possible, I suspected that cryptic diversity might be involved in this complex. This was corroborated by mtDNA data that we published in 2018. Together with Gerald Heckel and our PhD student Qindong Tang, I wanted to investigate this further using genome-wide data and test if gene flow is reduced in areas of potential contact between evolutionary lineages.

Breeding site of pale sand martin on the east edge of Qinghai-Tibetan Plateau in Zoige (Sichuan Province, China). Photo Credit: Qindong Tang

What difficulties did you run into along the way?

The biggest challenge was to get a comprehensive geographic sampling together. As pale sand martins breed in low densities only, this meant a lot of travelling. Fortunately, we could count on the great support of our collaboration partners and their network – Yang Liu from Sun Yat-sen University in Guangzhou, China, and Gombobaatar Sundev from the National University of Mongolia. Moreover, Qindong Tang made an incredible effort and did an excellent job during the fieldwork.

What is the biggest or most surprising innovation highlighted in this study?

Given the absence of obvious sexually selected traits and only gradual morphological differentiation between the different evolutionary lineages of the pale sand martin, the level of genetic differences and the fact that they behave like different species at least at the eastern edge of the Qinghai-Tibetan Plateau is indeed surprising. So, we were left with the following question: what processes and mechanisms prevent a complete mixing at secondary contact zones? We think that seasonal migration behavior might be an essential factor in maintaining genetic integrity of these morphologically cryptic evolutionary lineages.

Moving forward, what are the next steps in this area of research?

The next step is evident: we need to study in detail the migration behavior of the different lineages. The ranges of two of them meet in the area of a well-known avian migratory divide, where western lineages take a western migration route around the Qinghai-Tibetan Plateau to winter quarters in South Asia, and eastern lineages take an eastern route to Southeast Asia. This might also be the case in the pale sand martins and we hypothesize that hybrids might have nonoptimal intermediate migration routes and selection against them might restrict gene flow. This will need quite some field work and application of up-to-date technologies such as modern data loggers. Let’s hope that the development of the pandemic will allow field work again soon.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

It is best to get started and not be intimidated or even afraid. The easiest way to learn new methods is to start using them. It is also important to build a network of people who can be asked for support when problems arise.

What have you learned about methods and resource development over the course of this project?

As always in studies with a phylogeographic background, sampling matters most. Try to organize a complete geographic sampling at the beginning of a project. Sampling in parts of the distribution area of our study system was planned for the third year of Qindong’s PhD; however, this could not be achieved due to the pandemic. As a consequence, we still lack samples from western Mongolia, which would have been important and made the overall picture more comprehensive. This work did not include any development of new methods; however, a knowledge of state-of-the-art methodological approaches is obviously always crucial.

Describe the significance of this research for the general scientific community in one sentence.

Our study points towards contrasting migration behavior as an important factor in maintaining evolutionary diversity under morphological stasis.

Describe the significance of this research for your scientific community in one sentence.

Our discovery of cryptic diversity in the pale sand martin indicates that evolutionary diversity might be underestimated even in well-studied groups such as birds, and it suggests that it is worth having a closer look at widespread species occurring in different environments.

Photo of the first author Qindong Tang during the field work. Photo Credit: Qin Huang
Sampling team in Qinghai, PR China, June 2016. From left to right: Manuel Schweizer, Paul Walser Schwyzer, Yang Liu, Qin Huang, Yun Li and our driver. Photo Credit: Manuel Schweizer
Sampling team in Mongolia, June 2018. From left to right: Tuvshin Unenbat, Turmunbaatar Damba, Gombobaatar Sundev, Paul Walser Schwyzer, Manuel Schweizer, Silvia Zumbach, Sarangua Bayrgerel. Photo Credit: Manuel Schweizer

Tang Q, Burri R, Liu Y, Suh A, Sundev G, Heckel G, Schweizer M. 2022. Seasonal migration patterns and the maintenance of evolutionary diversity in a cryptic bird radiation. Molecular Ecology. https://doi.org/10.1111/mec.16241.

Interview with the authors: How can we use machine learning and genomics to make predictions about the effects of climate change?

In a recent paper in Molecular Ecology Resources, Fitzpatrick et al. used a combination of common garden experiments, genome sequencing, and machine learning analyses to understand how genomic offsets (a measure of maladaptation) can be used to predict how organisms might respond to future environmental change. They found that genetic offset was negatively associated with growth and was a better predictor of performance than the difference between sampling-site and common-garden environmental variables alone. See the full article for more details on how these trends aligned with panels of putatively informative and randomly selected SNPs, and the interview with lead author Matthew Fitzpatrick below for even more insight into this exciting work.

What led to your interest in this topic / what was the motivation for this study? My research focuses on spatial modeling of biodiversity and involves forecasting how climate change may impact natural systems. Demand for such forecasts continues to grow given the threats facing biodiversity. However, a major – and often overlooked – challenge is assessing forecasting models, which is really important given their potential to (mis)inform conservation. 

The motivation for this study was to test a type of genomics-based forecast founded on an idea that my coauthor Steve Keller and I developed a few years ago that we termed “genetic offsets”. Genetic offsets are in essence a forecast of climate maladaptation based on existing relationships between (adaptive) genomic variation and climate gradients. We tested how well genetic offsets correspond to biological responses to rapid climate change – in this case by transplanting trees from their home climate to a common garden experiment and measuring their response.

What difficulties did you run into along the way? There were all sorts of challenges one might expect with setting up and running common garden experiments in two countries, which as the modeler on the project I, thankfully, was largely isolated from. We were lucky to have Raju Soolanayakanahally on our team to help with common garden logistics in Canada, along with Steve’s lab running the Vermont common garden. Additionally, there was the challenge of how best to evaluate the population genomic data for signatures of local adaptation prior to the genetic offset modeling. It is always a challenge to ensure you’re minimizing the effects of population structure and false positives. Steve and his former postdoc Vikram Chhatre approached this from several angles to make sure we had a robust set of selection outliers. From the modeling perspective, we had to be creative about fitting and summarizing a very large number of machine learning models.

What is the biggest or most surprising innovation highlighted in this study? We found pretty solid evidence that genetic offsets can serve as a meaningful estimate of the degree of expected maladaptation of populations exposed to climate change. It was nice to get some confirmation of our idea, but what was really surprising was that sets of randomly selected SNPs predicted performance of trees as well as or slightly better than did our set of carefully selected candidate SNPs, which was the opposite of what we expected. We’ve seen some other evidence in our simulation studies that also suggests SNPs from the genomic background can be predictive of maladaptation, although the reasons for this are still being investigated.

Moving forward, what are the next steps in this area of research? Ours is a single study on a single species of tree. Many more tests are needed in other study systems before we can fully understand the situations in which genetic offsets can serve a useful purpose. Also, our study tested genetic offsets derived from the machine learning method Gradient Forest, but Gradient Forest is just one of several statistical methods that can be used to estimate offsets. An important next step in my lab is to perform similar testing using another promising method known as generalized dissimilarity modeling.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? Take good notes and document the process! You will thank yourself later. If you are developing a new method, it is important to thoroughly test it to be sure you understand how it behaves in different circumstances and to make clear its intended uses before publishing on it. And last, teach others to use your method! 

What have you learned about methods and resource development over the course of this project? I thought I knew a lot about Gradient Forest and its behavior, but this study – and another we have in review on testing genetic offsets using simulated data – taught me that methods do not always behave the way we might expect or hope. And even when we have simulated “truth known” data, it can be difficult to understand why methods are behaving a certain way.

Describe the significance of this research for the general scientific community in one sentence. This study shows that for some organisms it may be possible to use genetic data to inform climate change impact assessments.

Describe the significance of this research for your scientific community in one sentence. This study provides evidence that spatial patterns of adaptive genomic variation along climatic gradients can be used to estimate the magnitude of expected maladaptation of populations exposed to rapid climate change through time.  

Fitzpatrick MC, Chhatre VE, Soolanayakanahally RY, Keller SR. 2021. Experimental support for genomic prediction of climate maladaptation using the machine learning approach Gradient Forests. Molecular Ecology Resources. https://doi.org/10.1111/1755-0998.13374.

Summary from the authors: Estimating contemporary effective population size

Effective population size (Ne) is a crucial parameter in evolutionary biology that reflects the number of individuals in a theoretically ideal population having the same magnitude of loss of genetic variation as the population in question. There are several types of Ne estimates, and they vary in definition and application. For example, contemporary Ne represents the size of a population in the previous generation(s) and is a parameter of relevance in many species. Contemporary Ne is, however, difficult to estimate and in practice often remains unknown. This is particularly the case for large populations where the amount of drift in the short term is limited. We used genomic data from 85 collared flycatchers of an island population sampled at two time points, and applied several methods to estimate Ne. These methods either compared genetic variation between the two time points (temporal methods) or analyzed variation patterns from a single time point (LD-based methods). The temporal methods estimated Ne at a level of a few thousand, while the approach based on LD provided ambiguous estimates associated with high variance. Our results suggest that whole-genome data can help to estimate large contemporary Ne, but temporal sampling seems to be necessary.
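
For readers unfamiliar with temporal methods, the idea of comparing allele frequencies between two time points can be illustrated with a minimal sketch. It assumes the classic moment-based estimator (the Nei and Tajima standardized variance Fc with a sampling correction for individuals sampled without replacement), which is not necessarily one of the methods used in the study. The function names, the restriction to biallelic loci, and the input format are illustrative assumptions.

```python
def nei_tajima_fc(p0, pt):
    """Mean standardized variance in allele frequency change (Fc).

    p0, pt: frequencies of one allele at each biallelic locus in the
    first and second temporal sample (hypothetical input format).
    """
    values = []
    for x, y in zip(p0, pt):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip loci fixed for the same allele in both samples
            values.append((x - y) ** 2 / denom)
    if not values:
        raise ValueError("no informative loci")
    return sum(values) / len(values)


def temporal_ne(p0, pt, generations, n0, nt):
    """Moment-based temporal estimate of Ne.

    generations: number of generations between the two samples.
    n0, nt: number of diploid individuals in each sample.
    Assumes individuals were sampled without replacement.
    """
    fc = nei_tajima_fc(p0, pt)
    # Remove the part of the frequency change that is due to finite sampling.
    drift = fc - 1 / (2 * n0) - 1 / (2 * nt)
    if drift <= 0:
        # Observed change is no larger than sampling noise:
        # Ne cannot be distinguished from a very large population.
        return float("inf")
    return generations / (2 * drift)
```

The infinite estimate returned when the frequency change does not exceed sampling noise reflects the difficulty described above: in large populations, drift over a few generations is small relative to sampling error, so many loci and temporally spaced samples are needed.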

Article: Nadachowska-Brzyska K, Dutoit L, Smeds L, Kardos M, Gustafsson L, Ellegren H. 2021. Genomic inference of contemporary effective population size in a large island population of collared flycatchers (Ficedula albicollis). Molecular Ecology https://doi.org/10.1111/mec.16025.

Summary from the authors: landscape genetics of eastern indigo snakes

Landscape features, such as land use, vegetation cover, roads, and topography, strongly influence genetic connectivity, yet these relationships can vary across spatial scales, so evaluating landscape genetic relationships requires multi-scale approaches. We used the federally threatened eastern indigo snake (Drymarchon couperi), a terrestrial habitat generalist endemic to the southeastern United States, as a case study with which to evaluate the consequences of different approaches for accounting for spatial scale when optimizing genetic resistance surfaces using the software ResistanceGA. Resistance surfaces with scale selected using a true optimization approach, simultaneously comparing all possible combinations of scale across each set of covariates, performed better than resistance surfaces where scale was selected individually for each covariate. Truly optimized resistance surfaces also outperformed resistance surfaces based on habitat selection models and categorical land cover maps. Optimal scales were usually larger than average indigo snake home range sizes, suggesting that gene flow was mediated mostly by extra-home range dispersal. Large tracts of undeveloped upland habitat with intermediate habitat heterogeneity most promoted indigo snake gene flow, while roads did not appear to restrict gene flow. Our results show the importance of testing a wide range of spatial scales in landscape genetics studies.

The top-ranked optimized genetic resistance surfaces for eastern indigo snakes in central Florida from (a) categorical land cover surfaces, (b) multi-scale habitat selection models, and (c) multi-scale landscape covariates selected using a true optimization approach. (d) shows an average resistance surface across our best-supported truly optimized resistance surfaces. (Figure 6 from Bauder et al. 2021.)

Article: Bauder JM, Peterman WE, Spear SF, Jenkins CL, Whiteley AR, McGarigal K. 2021. Multiscale assessment of functional connectivity: Landscape genetics of eastern indigo snakes in an anthropogenically fragmented landscape in central Florida. Molecular Ecology https://doi.org/10.1111/mec.15979.

Summary from the authors: inbreeding and management in captive populations

Pacific salmon hatcheries aim to supplement declining wild populations and support commercial and recreational fisheries. However, there are also risks associated with hatcheries because the captive and wild environments are inherently different. It is important to understand these risks in order to maximize the success of hatcheries. Inbreeding, which occurs when related individuals interbreed, is one risk that may inadvertently be higher in hatcheries due to space limitations and other factors. Inbred fish may have reduced fitness and survival compared to non-inbred fish. We quantified inbreeding and its effect on key fitness traits across four generations in two hatchery populations of adult Chinook salmon that were derived from the same source. We utilized recent advancements in DNA sequencing technology, which provide much more precise estimates of inbreeding and its potential effects on fitness. Our results indicate that inbreeding may not be severe in salmon hatcheries, even small ones, provided that appropriate management practices are followed. However, we documented an influence of inbreeding on the phenology of adult spawners, which could have biological implications for individual fitness and population productivity. Our findings provide a better understanding of changes that may occur in hatchery salmon and will further inform research on “best” hatchery practices to minimize potential risks. 

Article: Waters CD, Hard JJ, Fast DE, Knudsen CM, Bosch WJ, Naish KA. 2020. Genomic and phenotypic effects of inbreeding across two different hatchery management regimes in Chinook salmon. Molecular Ecology https://doi.org/10.1111/mec.15356.