Interview with the authors: Background selection and FST: Consequences for detecting local adaptation

Recent work has suggested that background selection (BGS) may lead to incorrect inferences in FST outlier studies, generating substantial concern given the prevalence of these studies in evolutionary biology. In their recent Molecular Ecology publication, Matthey‐Doret and Whitlock investigate the effects of BGS on FST outlier tests using biologically realistic simulations, and find minimal effects. Matthey-Doret and Whitlock suggest that previous studies used unrealistic parameter values in simulations, leading to an overestimate of the effects of BGS in real studies. Read the full article here: https://onlinelibrary.wiley.com/doi/pdf/10.1111/mec.15197, and get a behind-the-scenes look at this work below.

Remi Matthey‐Doret uses his new program SimBit to study the effects of background selection (BGS) on FST.

What led to your interest in this topic / what was the motivation for this study? 
It all started with a paper by Cruickshank and Hahn (2014), in which they highlight a fear that background selection could be a confounding factor when detecting local adaptation in FST outlier studies. Curious about this issue, Mike and I investigated the question further and quickly realized that many of these fears were based on a misinterpretation of Charlesworth et al. (1997). Indeed, Charlesworth et al. (1997) demonstrated that background selection can cause FST peaks only under extreme and unrealistic parameter sets. They highlighted that their parameter choice was unrealistic as their goal was to find extreme effects, but this important limitation of their study was sadly often ignored by their readers. We therefore decided to perform simulations of background selection with realistic parameter choices.

What difficulties did you run into along the way? 
The main difficulty was technical. We tried to run these simulations with a number of popular simulation programs, but none of them was fast enough for our needs. We quickly realized that we had to write our own simulation software (SimBit) that would perform very well, especially for simulations with a lot of genetic diversity.

What is the biggest or most surprising finding from this study? 
When we started the study, I was actually expecting that background selection would have a stronger effect on FST and that it would bias the FST outlier methods used to detect local adaptation. Our finding was a surprise to us, but it was also comforting to realize that the results of the many studies using FST outlier methods were probably not affected by background selection.

Moving forward, what are the next steps for this research? 
I think there is a need for a clearer view of the relative importance of positive and negative selection in explaining patterns of genetic diversity within and between populations. I would also like to investigate further the interaction between selection coefficient and migration rate and how it affects within- and between-population genetic diversity. Such an endeavor would likely require a mixture of empirical and theoretical work.

What would your message be for students about to start their first research projects in this topic?  
I think there is a lot of intuition about the effect of linked selection in structured populations that has not been published. Talk to smart people! They may have some expectation about how background selection can affect the coalescent tree in structured populations that needs to be studied and written out.

What have you learned about science over the course of this project? 
I learned that a lot of the numerical tools that we use to analyse genetic data contain bugs (one of which is detailed in our article) and unstated (or somewhat neglected) assumptions. One must always take care to understand a particular piece of statistical software well before using it.

Describe the significance of this research for the general scientific community in one sentence.
We found that background selection does not cause peaks of population differentiation, and therefore that methods that use population differentiation to detect positive selection should be safe to use without worrying about background selection as a confounding factor.

Describe the significance of this research for your scientific community in one sentence.
We found that background selection does not cause much locus-to-locus variation in FST, and therefore that FST outlier methods for detecting positive selection should be safe to use without worrying about background selection as a confounding factor.

Full article:

Matthey‐Doret R, Whitlock MC. Background selection and FST: Consequences for detecting local adaptation. Mol Ecol. 2019;28:3902–3914. https://doi.org/10.1111/mec.15197.

Interview with the authors: Lack of gene flow: Narrow and dispersed differentiation islands in a triplet of Leptidea butterfly species

A diverse array of evolutionary processes contributes to diversity and divergence, and as large genomic datasets become more readily available, our ability to tease these processes apart increases. In their recent Molecular Ecology publication, Talla et al. generate genomic data from six populations of wood white butterflies and use these data to try to separate the effects of introgression, recombination rate variation, selection, and genetic drift. In contrast to many previous genome-scan studies, they find no evidence of introgression or parallelism. Rather, they find support for genetic drift and directional selection as having shaped genomic divergence between species. Read the full article here: https://doi.org/10.1111/mec.15188, and learn more below with a behind-the-scenes interview with the authors.

What led to your interest in this topic / what was the motivation for this study? 
We have a general interest in understanding the contribution of different molecular mechanisms and evolutionary forces to genomic differentiation between diverging lineages. Previous research in this area has revealed a rather complex interaction between selection, genetic drift, recombination rate variation and introgression and we thought we had found an ideal study system to tell these factors apart. In addition, we believe that it will be key to describe the divergence landscapes in many different taxonomic groups to understand the relative importance of different molecular and evolutionary factors in lineages with different genetic/genomic features, demographic histories and life-history characteristics.

What difficulties did you run into along the way? 
Our expectations regarding the study system were not really met. First, earlier observations suggested that hybridization occurs between species pairs in the Leptidea group when they occur in sympatry, indicating that introgression might differ between sympatric and allopatric species pairs, but this turned out to be wrong. Second, butterflies lack centromeres, which could indicate a more even recombination landscape than what is generally observed in taxa with centromeres, but we could not address this with our data. Third, we expected the divergence time between lineages to be short, which again was not the case. Finally, the three species are characterized by large differences in karyotype, and we wanted to investigate whether chromosomal rearrangements could underlie reproductive isolation, but this goal was out of reach with our data.

What is the biggest or most surprising finding from this study? 
It was surprising to us that there was no evidence for interspecific gene flow, given that hybrids have been observed. We were also very surprised by the deep divergence times between these virtually identical species. Besides that, we do not think the results are really surprising, but they do give some novel insight into the patterns of genomic divergence when there is no introgression and when chromosomes lack centromeres. One observation that we found interesting was that regions with high genetic differentiation (FST) had higher genetic divergence (DXY) than the genomic average. This may sound intuitive, but many previous ‘genome-scan’ studies have in fact found a negative relationship between differentiation and divergence, most likely as a consequence of reduced recombination in some regions leading to reduced diversity even before the lineages started to diverge.
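
The contrast between differentiation and divergence described above can be illustrated with a small numerical sketch. It is not the analysis used in the paper: it assumes biallelic sites with known population allele frequencies, applies no sample-size correction, and uses a Hudson-style FST (one minus the ratio of within- to between-population diversity). The function name `window_stats` and the three toy windows are hypothetical and chosen only for illustration.

```python
import math


def window_stats(p1, p2, n_sites):
    """Return per-site (dxy, pi_within, fst) for one genomic window.

    p1, p2  : allele frequencies at the same biallelic variable sites
              in population 1 and population 2 (illustrative inputs).
    n_sites : total number of sites in the window, invariant sites included.
    No sample-size correction is applied (a simplifying assumption).
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must describe the same sites")
    # Within-population diversity: expected pairwise differences per site.
    pi1 = sum(2 * a * (1 - a) for a in p1) / n_sites
    pi2 = sum(2 * b * (1 - b) for b in p2) / n_sites
    pi_within = (pi1 + pi2) / 2
    # Absolute divergence: expected differences between one sequence
    # drawn from each population.
    dxy = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2)) / n_sites
    # Hudson-style FST as a ratio of window averages.
    fst = 1 - pi_within / dxy if dxy > 0 else math.nan
    return dxy, pi_within, fst


# Three toy windows of 1000 sites each (made-up frequencies).
windows = {
    # Shared polymorphism, no differentiation.
    "background": ([0.5] * 10, [0.5] * 10),
    # Very little diversity overall, one fixed difference.
    "low diversity": ([1.0], [0.0]),
    # Many fixed differences.
    "deep divergence": ([1.0] * 10, [0.0] * 10),
}

results = {name: window_stats(p1, p2, 1000) for name, (p1, p2) in windows.items()}

for name, (dxy, pi_within, fst) in results.items():
    print(f"{name:16s} dxy={dxy:.4f} pi_within={pi_within:.4f} fst={fst:.2f}")
```

In this toy setting, the "low diversity" window reaches maximal FST even though its DXY is below the background value, which is the negative relationship reported in many earlier genome scans. The "deep divergence" window has both high FST and DXY above the background, which is the pattern described for Leptidea.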

Moving forward, what are the next steps for this research? 
We are developing more resources to generate genome assemblies of multiple species in the study system and we are also working on establishing high-density linkage maps for multiple populations with different karyotypes. These tools will help us pinpoint chromosome rearrangements and investigate if these have played a role in the divergence process. The data will also be used to quantify the effects of fissions and fusions on the recombination landscape. We are also delving into other approaches to understand how ecological and behavioral differences between species leave footprints in the DNA sequences or epigenetic marks (and vice versa). Given the deep divergence times between species and the apparent lack of gene flow, we will mainly focus on intraspecific comparisons where we observe some incompatibilities between some populations with distinct karyotypes.

What would your message be for students about to start their first research projects in this topic? 
We would suggest reading up on the previous literature in detail. We also encourage students to contact leading researchers in the field to discuss potential questions. Most people are really helpful and interested in knowing about other research efforts within their field. Discussing directly with experienced researchers also gives a sense of the key questions that should be addressed to extend the knowledge in the field. Given the copious amount of data we generate these days and the integrative nature of the questions we ask, it is also crucial to develop some skills in bioinformatics and scripting and to have a network of collaborators/colleagues who can provide help and support in theory, experimental studies and data analyses.

What have you learned about science over the course of this project? 
That science is an extremely time-consuming and dynamic process and that the first glimpse of the data does not necessarily reflect the final results. Moreover, that project plans need to be revised regularly, because the initial strategies do not always work out as outlined. We also acknowledge the importance of establishing a network of colleagues with expertise in different areas of the field: in our experience, most research projects within our field are becoming more and more integrative, and it will be increasingly difficult to conduct advanced research without collaboration.

Describe the significance of this research for the general scientific community in one sentence.
We verify that genomic differentiation between diverging lineages is affected by a complex interaction between molecular mechanisms and evolutionary forces and stress the importance of studying organisms with different genomic features, demographic histories and life-history characteristics.

Describe the significance of this research for your scientific community in one sentence.
In contrast to much of the previous work on patterns of genomic diversity and differentiation, our study provides insight into divergence processes when the effects of gene flow and/or a shared and highly variable recombination landscape are absent.

Full article: Talla V, Johansson A, Dincă V, et al. Lack of gene flow: Narrow and dispersed differentiation islands in a triplet of Leptidea butterfly species. Mol Ecol. 2019;28:3756–3770. https://doi.org/10.1111/mec.15188.