Interview with the authors: quality and quantity of genetic relatedness data affect the analysis of social structure

Understanding the influence of relatedness on fine-scale social interactions within a population is fundamental to understanding the role of kinship in animal societies. In this study, Foroughirad et al. provide insight into the quality of Single Nucleotide Polymorphism (SNP) data required to obtain accurate and precise parentage assignments and relatedness coefficients using data from a long‐term behavioural study on bottlenose dolphins with a known partial pedigree. They then go on to explore how the quality of these estimates influences post-hoc analyses of the relationship between relatedness and social structure. Again, they provide important practical guidance about the quality of data needed for these types of analyses. This article was published in Molecular Ecology Resources: read the full article here, and read our interview with Vivienne Foroughirad, lead author of the study, below.

An adult female bottlenose dolphin with her six-month old calf. Photo Credit: Vivienne Foroughirad

What led to your interest in this topic / what was the motivation for this study? 

In the broadest sense, my research interests concern the evolution of sociality and complex social behaviors such as cooperation. To that end, I was interested in our ability to parcel out contexts in which cooperation occurs between kin versus between unrelated individuals. Non-kin cooperation is rare in animal societies, and a common way to search for examples is to first investigate the link between the strength of social relationships and the genetic relatedness of pairs. Since genotyping-by-sequencing is now cheaper and more accessible than ever, I wanted to explore the effects this increased resolution would have on our power to test the relationship between social structure and relatedness, especially in viscous populations with strong philopatry.

What difficulties did you run into along the way? 

In our case, the greatest challenges centered around maintaining a longitudinal study on a wild marine mammal with a large enough sample size to make answering these types of questions feasible. We were lucky to have over 30 years of data available from the Shark Bay Dolphin Project which allowed us to verify some of the reconstructed pedigree relationships, as well as measure detailed home range usage and social associations. An analytical difficulty we encountered is how to account for the confounding effect of philopatry or limited dispersal on social relationships with kin if you want to distinguish kin discrimination from more passive kin associations that are a byproduct of shared space use. 

What is the biggest or most surprising finding from this study? 

We provided evidence that genotyping-by-sequencing methods could produce more precise relatedness values than typical microsatellite analyses, which isn’t surprising. What was less well-understood was the effect this would have on downstream analyses, such as those testing whether relatedness correlated with social affiliation. We found that even though our study species exhibits strong, life-long affiliative relationships between maternal kin, there were a surprising number of scenarios under which our analysis failed to detect a significant correlation between genetic relatedness and social associations. We also found surprisingly diminishing returns in relatedness resolution with increasing sample size (number of individuals) when small numbers of markers were used.

Moving forward, what are the next steps for this research? 

Pedigree reconstruction is rapidly improving, especially where there is access to new genetic resources such as chromosome-level assemblies for non-model organisms. Improved kin assignment methods will allow us to investigate the function of these relationships at the level of the individual, which will help us to tease out how both intra- and inter-specific variation in ecology and demography affect social behavior. Within my own study site, I’m using these data to look at the effect of family size on social network position and reproductive success, as well as the demographic conditions that facilitate the formation of non-kin bonds. We’re also working on ways to better discriminate between maternal and paternal kin, which will be important for investigating the mechanisms of kin recognition.

What would your message be for students about to start their first research projects in this topic? 

That this is a great idea! Rapid advances in technology will open up new avenues of inquiry and there is lots of work to be done. Nevertheless, as with any field, you also need to know when to stop and submit. There will always be a new higher coverage genome or updated version of the software you’re using that’s about to be released, but if you keep reanalyzing your data with each advance, you’ll never finish a project. My second piece of advice would be to practice simulating data and analyzing it. Building simulated datasets, tweaking parameters, and testing different software has really deepened my understanding of methodologies. Plus, you can start before you even get your first sequencing results and be ready with a tested pipeline when you do get results in hand.
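As one illustration of the simulation practice recommended above, here is a minimal Python sketch (not the pipeline used in the study). It simulates parent-offspring pairs at biallelic SNPs and estimates their relatedness with a simple moment estimator that assumes the population allele frequencies are known exactly. The function names, the allele-frequency range of 0.1 to 0.9, and the marker counts are all illustrative assumptions.

```python
import random
import statistics


def simulate_parent_offspring(freqs, rng):
    """Return allele dosages (0, 1 or 2) for a parent and one offspring.

    Assumes unlinked biallelic loci in Hardy-Weinberg equilibrium; the
    second parent is represented by one random allele from the population.
    """
    parent, offspring = [], []
    for p in freqs:
        a1 = int(rng.random() < p)          # parent's first allele
        a2 = int(rng.random() < p)          # parent's second allele
        inherited = a1 if rng.random() < 0.5 else a2
        other = int(rng.random() < p)       # allele from the other parent
        parent.append(a1 + a2)
        offspring.append(inherited + other)
    return parent, offspring


def estimate_relatedness(x, y, freqs):
    """Average over loci of (x - 2p)(y - 2p) / (2p(1 - p)).

    With known allele frequencies p, the expectation of this quantity is
    the pairwise relatedness (0.5 for a parent-offspring pair).
    """
    terms = [
        (xi - 2 * p) * (yi - 2 * p) / (2 * p * (1 - p))
        for xi, yi, p in zip(x, y, freqs)
    ]
    return sum(terms) / len(terms)


def spread_of_estimates(n_markers, n_reps=200, seed=1):
    """Mean and standard deviation of estimates across simulated pairs."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        # Illustrative assumption: allele frequencies uniform on [0.1, 0.9].
        freqs = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
        x, y = simulate_parent_offspring(freqs, rng)
        estimates.append(estimate_relatedness(x, y, freqs))
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for n_markers in (20, 100, 500):
        mean, sd = spread_of_estimates(n_markers)
        print(f"{n_markers} markers: mean={mean:.3f}, sd={sd:.3f}")
```

Under these assumptions the mean estimate should sit near the true value of 0.5 at every marker count, while the spread across replicates shrinks as markers are added. Swapping in other relationship types, genotyping error, or estimated allele frequencies is a natural next exercise.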

What have you learned about science over the course of this project? 

That building a robust, reproducible, and well-documented pipeline for analysis is crucial. It might take a bit more work to set up, but it’s always worth it. I also benefitted a lot from the opportunity to present my work to audiences from different disciplines which helped me keep the big picture in mind since I’m the kind of person that gets easily caught up in minutiae. Biologically, I’m always reminded that there’s so much individual variation that gets masked by conducting analyses at the population level, and that rather than being discounted as noise, that variation could be leveraged to ask really interesting questions about how ecology and demography affect behavior. 

Describe the significance of this research for the general scientific community in one sentence.

The correlation between genetic relatedness and the strength of social relationships can be masked by the limited power of typical published sample sizes.

Describe the significance of this research for your scientific community in one sentence.

We provide practical guidance for how sample sizes and sequencing methods might interact to improve precision of relatedness estimates and their effect on the analysis of social structure, using wild bottlenose dolphins as a case study.

Three juvenile male bottlenose dolphins surface synchronously. Photo Credit: Vivienne Foroughirad

Foroughirad, V., Levengood, A. L., Mann, J., & Frère, C. H. (2019). Quality and quantity of genetic relatedness data affect the analysis of social structure. Molecular Ecology Resources, 1181–1194. https://doi.org/10.1111/1755-0998.

Interview with the authors: Linking plant genes to insect communities: Identifying the genetic bases of plant traits and community composition

Much research in community genetics attempts to understand how genetic variation influences community composition, but the majority of studies have been done at the level of the genotype. In their new Molecular Ecology paper, Barker and colleagues use genome-wide association mapping in aspen (Populus tremuloides) to identify specific genes that may influence variation in tree traits or in insect communities. They uncover 49 SNPs that are significantly associated with tree traits or insect community composition. Notably, insects with closer associations with host plants have more genetic correlations than less closely associated insects. Barker and colleagues find a SNP associated with insect community diversity and the abundance of interacting species, providing a link between genetic variation in aspen and insect community composition. Finally, they find that tree traits explain some of the significant relationships between SNPs and insect community composition, suggesting a mechanism by which these genes may influence community composition. Read the full article here, and get a behind-the-scenes interview with lead author Hilary Barker below.

What led to your interest in this topic / what was the motivation for this study? 
For some time, we have been interested in extended phenotypes – the idea that the genes of an organism not only shape the immediate traits of that organism, but also extensions of these traits, such as the community of insects living on a tree. Yet, until our study, most of the previous research had been largely focused on differences across genotypes of ‘host’ organisms (e.g., aspen, cottonwoods, evening primrose), rather than the underlying genes. Thus, there were a lot of unknowns yet to be discovered. For instance, would the genetic effects be large enough to detect and identify? Would more underlying tree genes be found for insects that are more closely associated with the tree (i.e., leaf gallers) rather than free feeding insects? Would there be an overlap between genes associated with insect communities and genes associated with particular tree traits?

What difficulties did you run into along the way? 
I think the largest challenge of conducting a Genome-Wide Association study on a common garden of trees is the planting and maintenance of this small forest. We had 1824 trees that needed planting, phenotyping, and care. This work was most intense in the first four years of the study to ensure that each tree survived a summer drought and multiple harsh winters. The next most challenging hurdle was conducting the insect surveys. These surveys involved a large team effort and happened during some of the hottest days of the summer. 

What is the biggest or most surprising finding from this study? 
The most exciting finding from this study was the identification of an aspen gene (early nodulin-like [ENODL] transmembrane protein, Potra001060g09097) that underlies insect community composition, including both diversity and the abundance of key insect species (aphids and ants). While we do not yet know the mechanism by which this gene influences insect communities, we do know that this protein is involved in the transportation of carbohydrates. Thus, it’s possible that this gene directly influences aphids and ants via their interactions with carbohydrate-rich honeydew, and/or indirectly influences insects via numerous tree traits, including both growth (size) and defense. To our knowledge, this is the first identification of allelic variation in a plant gene that is associated with a complex insect community trait (i.e., insect community composition).

Moving forward, what are the next steps for this research? 
The next step of this research is to explore how the genetic underpinnings of these aspen traits and associated insect communities may vary across different environmental gradients and with tree ontogeny. Previous research has shown that aspen growth and defense traits vary with tree age, and these traits play a significant role in determining which insects will feed upon the foliage. Thus, the genetic contributions to insect community composition may vary substantially for more mature trees. The Lindroth lab is currently working on an expanded version of this study with more detailed traits and mature (reproductive) trees. In addition, gene expression will vary with different environmental conditions, which will likely also modify which genes are most important in shaping insect communities.

What would your message be for students about to start their first research projects in this topic? 
To complete a large community genomics study such as this, you will need a few key things. First, you will need a lot of help. Start recruiting anyone and everyone in sight. Mentoring undergraduates will be essential, and ensuring that you can effectively assess the learning of your mentees and volunteers is critical (e.g., can they correctly identify X insect? Can they successfully complete X protocol in the lab?). Second, get organized. Project management platforms can be really helpful (e.g., Asana, MS Teams, etc.) to keep track of tasks. Third, refine your R markdown scripts. You will generate more data than you know what to do with, and thus creating R scripts to clean up, organize, and analyze your data will be a top priority. Also, if you can get a digital microscope (e.g., Dino Lite), then the tedious task of keying out insect specimens will be much easier and less cumbersome! I highly recommend it.

What have you learned about science over the course of this project? 
In terms of a genome-wide association study, it is best to have as large a sample size as possible (more genotypes and genetic variation). You do not want to invest a lot of resources into a study that has low statistical power for association testing. Also, phenotype as many traits as you can. At the onset, it is impossible to know which genes, if any, will be associated with which traits. Thus, you could end up with a lot of investment while identifying a small number of associated genes, or potentially no genes at all. 

Describe the significance of this research for the general scientific community in one sentence.
Our findings show that specific genes in a host organism can shape the composition of associated communities.

Describe the significance of this research for your scientific community in one sentence.
Complex extended phenotypes such as community composition have an identifiable genetic basis, and thus we can use this information to test and study the extent and limitations of community evolution.

Full article: Barker HL, Riehl JF, Bernhardsson C, et al. Linking plant genes to insect communities: Identifying the genetic bases of plant traits and community composition. Mol Ecol. 2019;28:4404–4421. https://doi.org/10.1111/mec.15158

Summary from the authors: City life alters the gut microbiome and stable isotope profiling of the eastern water dragon (Intellagama lesueurii)

Whilst urbanisation poses a major threat to many species, there is growing evidence to suggest that some species, labelled ‘urban adapters’, are thriving within the urban landscape. Urban landscapes differ drastically from native habitats: urban adapters are often exposed to a more diverse range of novel food items than their rural counterparts, frequently including human-subsidised resources. Diet is one of the most important factors influencing the gut microbiome, an extremely influential symbiotic community that plays a critical role in many processes affecting host health and fitness, including metabolism, nutrition, immunology and development. Here, using populations of the eastern water dragon in Queensland, Australia, we explore the link between urbanisation, diet and gut microbial changes. We show that city dragons exhibit a more diverse gut microbiome than their rural counterparts, and display microbial signatures of a diet that is richer in plant material and higher in fat. Elevated levels of the Nitrogen-15 isotope in the blood of city dragons also suggest their diet may be richer in protein. These results highlight that urbanisation can have pronounced effects on the gut microbial communities of wild animals, but we do not yet know the possible repercussions of these microbial changes.

Full article: Littleford‐Colquhoun BL, Weyrich LS, Kent N, Frere CH. City life alters the gut microbiome and stable isotope profiling of the eastern water dragon (Intellagama lesueurii). Mol Ecol. 2019;28:4592–4607. https://doi.org/10.1111/mec.15240

Summary from the authors: Host plant associations and geography interact to shape diversification in a specialist insect herbivore

The sexual generation female of the cynipid gall wasp Belonocnema treatae, which forms locally adapted populations on three closely related live oaks across the southeastern United States. Photo credit: Jena Johnson.

In this study, we wanted to know how geography and ecology predicted population genetic structure among 58 populations of the gall wasp Belonocnema treatae, which exhibits regional specialization on three host plant species across the U.S. Gulf Coast. We combined range-wide sampling with a genotype-by-sequencing approach for 40,699 SNPs across 1,217 individuals. Disentangling the processes underlying geographic and environmental patterns of biodiversity is challenging, as such patterns emerge from eco‐evolutionary processes confounded by spatial autocorrelation among sample units. We evaluated this question using a hierarchical Bayesian model (ENTROPY) to assign individuals to genetic clusters and estimate admixture proportions. Using distance-based Moran’s eigenvector mapping, we generated regression variables that represent varying degrees of spatial autocorrelation in genetic variation among sample sites. These spatial variables, along with host association, were incorporated in distance-based redundancy analysis (dbRDA) to partition the relative contributions of host plant and spatial autocorrelation. This novel approach of combining ENTROPY results with dbRDA to analyze SNP data unveiled a complex mosaic of diversification within and among insect populations forming discrete host associated lineages coupled with geographic variation. This demonstrates that geography and ecology play significant roles in explaining patterns of genomic variation in B. treatae – an emerging model of ecological speciation.

Full article: Driscoe AL, Nice CC, Busbee RW, Hood GR, Egan SP, Ott JR. Host plant associations and geography interact to shape diversification in a specialist insect herbivore. Mol Ecol. 2019;28:4197–4211. https://doi.org/10.1111/mec.15220

Summary from the authors: Polygenic selection drives the evolution of convergent transcriptomic landscapes across continents within a Nearctic sister-species complex

Lake Whitefish (Coregonus clupeaformis)
Adult size benthic (normal) and limnetic (dwarf) sympatric whitefish
Sampled from Cliff Lake, Maine, USA.
Illustration by Paul Vecsei.

The Lake Whitefish (Coregonus clupeaformis) is found in Nearctic post-glacial freshwater lakes, and diverged about 500,000 years ago from its sister species, the European Whitefish (Coregonus lavaretus), found throughout Northern Europe, European Alpine lakes and Russia. These two lineages underwent an adaptive radiation following the last glaciation, resulting in the sympatric occurrence of limnetic and benthic species-pairs. Decades of research have aimed to decipher the process of adaptive divergence and ecological speciation in the Whitefish species complex. Here, we compared independent diverging species-pairs from the two continents to elucidate the genomic and transcriptomic bases associated with the benthic-limnetic diversification. We used a statistical framework to detect polygenic targets of selection associated with phenotypic diversification. We identified a subset of genes that showed convergent patterns of differential expression between limnetic and benthic species across both continents. Those adaptive divergent genes retained a higher degree of shared polymorphism among species-pairs, most likely due to balancing selection, and this genetic variation was associated with changes in levels of gene expression between species. As such, our results indicate that standing genetic variation underlying phenotypes involved in the ecological speciation of the whitefish species-pairs has been partly maintained in parallel across both continents for at least half a million years.

-Clément Rougeux – Postdoctoral Researcher, University of Calgary

Full article: Rougeux C, Gagnaire P‐A, Praebel K, Seehausen O, Bernatchez L. Polygenic selection drives the evolution of convergent transcriptomic landscapes across continents within a Nearctic sister species complex. Mol Ecol. 2019;28:4388–4403. https://doi.org/10.1111/mec.15226

Interview with the authors: glacial refugia and the dispersal of terrestrial invertebrates

Antarctica is an extreme and isolated environment that supports a variety of species. However, we know little about how terrestrial species survive in these kinds of conditions. In a recent paper in Molecular Ecology, McGaughran and colleagues investigated a widespread group of terrestrial invertebrates to understand how species have persisted in this harsh environment. These researchers found that there were many local clusters of individuals with substantially more long-distance dispersal events than were previously identified. These long-distance dispersers were likely aided by wind, providing an interesting example of the link between environmental conditions and population stability. For more information, please see the full article and the interview with McGaughran, lead author of the study, below. 

The Antarctic Peninsula, photographed near its tip. Photo credit: Dr. Ceridwen Fraser.

What led to your interest in this topic / what was the motivation for this study? 
During my PhD, I researched genetic and physiological diversity of Antarctic terrestrial invertebrates, spending a collective ~6 months on the ice. I then stepped away from Antarctic research for several years, completing postdocs in Germany and Australia, but I never forgot my time in Antarctica or my love for its unique environment. Thus, I’ve maintained collaborative links that have allowed me to continue to contribute to Antarctic research. In this study, we wanted to see whether genomic data would give us greater insight into the evolutionary history of invertebrates along the Antarctic Peninsula than had been gained with single-gene analysis in the past.

What difficulties did you run into along the way? 
Getting workable quantities of DNA from tiny (~1 mm) springtails to use in genomic applications is difficult.  In fact, for this study, we tried to extract DNA from several Antarctic springtail species, but were only successful in our attempts with Cryptopygus antarcticus antarcticus.  Low DNA concentrations can also mean that the genomic data we end up with for analysis is patchy.  These aspects provide some challenges, but the methodologies underlying library preparation and sequencing are continually improving and we are excited about the potential of applying genomic methodologies to more Antarctic taxa in the future.

What is the biggest or most surprising finding from this study? 
Using genome-wide data, we were able to find evidence for a greater frequency of dispersal events than had been previously shown with single-gene data.  This was particularly surprising because dispersal for Antarctic invertebrates is hard.  These animals live under the rocks in moist ice-free areas.  As soon as they leave the relative safety of the soil column, they are exposed to freezing and desiccating conditions.  Thus, though we have some evidence to suggest that springtails can survive for short periods in humid air columns or floating on water, our expectation is that such events would be rare.  Finding genetic evidence that suggested several instances of successful dispersal over extremely long geographic distances was therefore surprising.

Moving forward, what are the next steps for this research? 
Much of the Antarctic literature focused toward understanding evolutionary and biogeographic questions has been based on single-gene analyses because genomic approaches are still relatively new.  This previous work has been informative about the fact that many Antarctic terrestrial species have survived glaciation in refugia, but there is much that remains to be discovered.  Antarctica is a kind of barometer for the rest of the world and it is important that we understand how species there have responded to environmental change in the past and how they may do so in the future.  Thus, key to extending this research will be to bring genomic approaches to bear on other populations and species in Antarctica.  This will help us to gain an understanding of how isolated Antarctica really is, and how its endemic species will likely respond to future environmental changes.

What would your message be for students about to start their first research projects in this topic? 
In this genomic and associated bioinformatic era, learning the skills of a well-rounded biologist who has a breadth of understanding that spans the field, the laboratory, and the computer, can be daunting.  As you develop or use novel techniques in Molecular Ecology, my message would be to stick with it through the hard stuff.  It is such an exciting time to be an evolutionary biologist and, though it can involve some really tough moments, the revelations we can achieve about how the world works are key.  Alongside this, I would suggest that collaboration is now more important than ever – don’t feel like you have to reinvent the wheel or be an expert on every single aspect of your research.  Instead, develop your own niche and share in the expertise of those around you to do the best science together.

What have you learned about science over the course of this project? 
When I first started doing research, there was no such thing as genomics or next generation sequencing and we simply didn’t have the means to gain genome-wide data.  In recent years, the face of evolutionary biology has changed due to the revolution in sequencing technology and bioinformatics.  As exemplified by this project, I’ve learned that genomic data can provide new and more nuanced insights into our biological questions of interest.  And, though it can be hard at times to work in such a swift-moving area of research, it is ultimately very rewarding.

Describe the significance of this research for the general scientific community in one sentence.
The environment, especially wind, plays an important role in structuring patterns of genetic diversity among Antarctic populations – thus future climatic changes are likely to have a significant impact on the distribution and diversity of these populations.  

Describe the significance of this research for your scientific community in one sentence.
Bringing genomic data to bear on long-standing evolutionary questions in Antarctica is a worthwhile and fruitful endeavour that will ultimately produce greater insights into understanding and protecting Antarctic taxa.

The Antarctic Dry Valleys. Photo credit: Dr. Angela McGaughran.

McGaughran A, Terauds A, Convey P, Fraser CI. 2019. Genome‐wide SNP data reveal improved evidence for Antarctic glacial refugia and dispersal of terrestrial invertebrates. Molecular Ecology. 28:4941-4957. https://doi.org/10.1111/mec.15269.

Interview with the authors: Parent and offspring genotypes influence gene expression in early life

Early life stress can often have long-term fitness effects on organisms, and the molecular mechanisms behind this have long been of interest to biologists. While much work has demonstrated that changes in DNA methylation patterns are involved, the transcriptional effects of early life stress are less well-understood, particularly at a genome-wide level. In a recent Molecular Ecology paper, Daniel J. Newhouse and colleagues investigate the transcriptional effects of different parental care strategies in white-throated sparrows. In white-throated sparrows, there are two morphs, and two associated mating pair types: tan male x white female (TxW) pairs, and white male x tan female (WxT) pairs. While TxW pairs provide biparental care, WxT pairs provide female-biased parental care. Newhouse and colleagues use RNA sequencing to assess the transcriptional effects of these differences in parental care strategies. They find evidence of an elevated stress response in offspring of WxT pairs, which provide female-biased parental care. For more information, read the full article, and see the in-depth interview with the authors below.

A white morph female white-throated sparrow feeding her nestlings. Photo credit: Tiffany Deater.

What led to your interest in this topic / what was the motivation for this study? 
Early in graduate school, I participated in the white-throated sparrow genome sequencing project. That project was my crash course in white-throated sparrow biology, and the unique genetics and associated behaviors of the sparrows fascinated me. Most work on white-throated sparrows focuses on the adults, but nestlings are relatively understudied.  Depending on the adult pair type of the nest, nestlings will either receive biparental care (parents=tan morph male & white morph female) or female-biased parental care (parents=white morph male & tan morph female). Essentially, I wanted to see how this parental care variation impacted the nestlings.

What difficulties did you run into along the way? 
Finding white-throated sparrow nests was much harder than I ever imagined. Hiking through bogs while fighting off swarms of biting insects made it even more difficult. Thankfully, I have wonderful collaborators who are amazing at finding nests.
Also, when we designed this study, there weren’t many examples of RNA-seq from bird blood. White-throated sparrow nestlings are very small, so the amount of blood we can collect is quite small. RNA extractions proved more difficult than expected, but we managed to sequence a sufficient amount for the study.

What is the biggest or most surprising finding from this study? 
It was surprising to see a solid signature of morph-specific gene expression. As adults, there are many differences in the transcriptome between morphs and these correlate strongly with their behavior. White-morph and tan-morph nestlings look the same and do not exhibit any morph-specific behaviors like we see in adults. Despite this, we found that a large number of genes found within the chromosomal inversion are differentially regulated. Some of these genes have also been previously identified in the brain of adult white-throated sparrows. It was cool to see the same genes appear very early in life and in a much different tissue (blood).

Moving forward, what are the next steps for this research? 
From a genomics perspective, it would be great to identify the regulatory mechanisms underlying the gene expression signatures we identified here. Additionally, within a single nest, there are both white morph and tan morph nestlings. This allows us to look at nestling morph-specific responses to variation in parental care. We identified some differences between the morphs within a nest, but were ultimately too limited by sample size to discuss this in depth. I think this will be a really interesting topic to explore further.

What would your message be for students about to start their first research projects in this topic? 
I suggest pursuing integrative projects, like much of the work published in Molecular Ecology. Associated with that, I suggest networking and establishing collaborations early. We can’t all be experts in everything, so collaborating with research groups that complement your interests can be beneficial.
More generally, keep up with the literature as much as you can. The more you know about your system and anything related to it, the better. Don’t forget to read up on methods papers, too. Data analysis is very important so having a grasp on analytical concepts will really help.

What have you learned about science over the course of this project? 
There’s no universal way to analyze data. There are so many tools to process genomic data, so it can be overwhelming at times to keep track of everything. I also learned that data analysis takes much longer than you plan. Inevitably something won’t work, so keeping a positive attitude throughout is crucial.

Describe the significance of this research for the general scientific community in one sentence.
Parental genotype is correlated with a transcriptomic stress response in their offspring.

Describe the significance of this research for your scientific community in one sentence.
Half of all adult white-throated sparrow pairs provide female-biased parental care and this stable parental care strategy induces a transcriptomic stress response in their offspring.

Newhouse DJ, Barcelo‐Serra M, Tuttle EM, Gonser RA, Balakrishnan CN. Parent and offspring genotypes influence gene expression in early life. Mol Ecol. 2019;28:4166–4180. https://doi.org/10.1111/mec.15205.

Interview with the authors: Background selection and FST: Consequences for detecting local adaptation

Recent work has suggested that background selection (BGS) may lead to incorrect inferences in FST outlier studies, generating substantial concern given the prevalence of these studies in evolutionary biology. In their recent Molecular Ecology publication, Matthey‐Doret and Whitlock investigate the effects of BGS on FST outlier tests using biologically realistic simulations, and find minimal effects. Matthey-Doret and Whitlock suggest that previous studies used unrealistic parameter values in simulations, leading to an overestimate of the effects of BGS in real studies. Read the full article here: https://onlinelibrary.wiley.com/doi/pdf/10.1111/mec.15197, and get a behind-the-scenes look at this work below.

Remi Matthey‐Doret uses his new program SimBit to study the effects of background selection (BGS) on FST.

What led to your interest in this topic / what was the motivation for this study? 
It all started with a paper by Cruickshank and Hahn (2014), in which they highlight a fear that background selection could be a confounding factor to local adaptation in FST outlier studies. Curious about this issue, Mike and I investigated the question further and quickly figured out that many of these fears were based on a misinterpretation of Charlesworth et al. (1997). Indeed, Charlesworth et al. (1997) demonstrated that background selection can cause FST peaks only for extreme and unrealistic parameter sets. They highlighted that their parameter choice was unrealistic as their goal was to find extreme effects, but this important limitation of their study was sadly often ignored by their readers. We therefore decided to perform simulations of background selection with realistic parameter choices.

What difficulties did you run into along the way? 
The main difficulty was technical. We tried to run these simulations with a number of popular simulation programs, but none of them was fast enough for our needs. We quickly realized that we had to write our own simulation software (SimBit) that would perform very well, especially for simulations with a lot of genetic diversity.

What is the biggest or most surprising finding from this study? 
Starting the study, I was actually expecting that background selection would have a stronger effect on FST and that it would bias the FST outlier methods used to detect local adaptation. Our finding was a surprise to us, but it was also comforting to realize that the results of the many studies using FST outlier methods were probably not affected by background selection.

Moving forward, what are the next steps for this research? 
I think there is a need for a clearer view of the relative importance of positive and negative selection in explaining patterns of genetic diversity within and between populations. I would also like to investigate further the interaction between selection coefficient and migration rate and how it affects genetic diversity within and between populations. Such an endeavor would likely require a mixture of empirical and theoretical work.

What would your message be for students about to start their first research projects in this topic?  
I think there is a lot of intuition about the effect of linked selection in structured populations that has not been published. Talk to smart people! They may have some expectation about how background selection can affect the coalescent tree in structured populations that needs to be studied and written out.

What have you learned about science over the course of this project? 
I learned that a lot of the numerical tools that we use to analyse genetic data contain bugs (one of which is detailed in our article) and unstated (or somewhat neglected) assumptions. One must always be very careful to have a good understanding of a particular piece of statistical software before using it.

Describe the significance of this research for the general scientific community in one sentence.
We found that background selection does not cause peaks of population differentiation, and therefore that methods that use population differentiation to detect positive selection should be safe to use without worrying about background selection as a confounding factor.

Describe the significance of this research for your scientific community in one sentence.
We found that background selection does not cause much locus-to-locus variation in FST, and therefore that FST outlier methods to detect positive selection should be safe to use without worrying about background selection as a confounding factor.

Full article:

Matthey‐Doret R, Whitlock MC. Background selection and FST: Consequences for detecting local adaptation. Mol Ecol. 2019;28:3902–3914. https://doi.org/10.1111/mec.15197.

Interview with the authors: Lack of gene flow: Narrow and dispersed differentiation islands in a triplet of Leptidea butterfly species

A diverse array of evolutionary processes contribute to diversity and divergence, and as large genomic datasets become more readily available our ability to parse apart these processes increases. In their recent Molecular Ecology publication, Talla et al. generate genomic data from six populations of wood white butterflies and use this data to try to tease apart the effects of introgression, recombination rate variation, selection, and genetic drift. In contrast to many previous genome-scan studies, they find no evidence of introgression or parallelism. Rather, they find support for genetic drift and directional selection as having shaped genomic divergence between species. Read the full article here: https://doi.org/10.1111/mec.15188, and learn more below with a behind-the-scenes interview with the authors.

What led to your interest in this topic / what was the motivation for this study? 
We have a general interest in understanding the contribution of different molecular mechanisms and evolutionary forces to genomic differentiation between diverging lineages. Previous research in this area has revealed a rather complex interaction between selection, genetic drift, recombination rate variation and introgression and we thought we had found an ideal study system to tell these factors apart. In addition, we believe that it will be key to describe the divergence landscapes in many different taxonomic groups to understand the relative importance of different molecular and evolutionary factors in lineages with different genetic/genomic features, demographic histories and life-history characteristics.

What difficulties did you run into along the way? 
Our expectations regarding the study system were not really met. First, earlier observations suggested that hybridization occurs between species pairs in the Leptidea group when they occur in sympatry, indicating that introgression might differ between sympatric and allopatric species pairs, but this turned out to be wrong. Second, butterflies lack centromeres and this could indicate a more even recombination landscape than what is generally observed in taxa with centromeres, but we could not address this with our data. Third, we expected that the divergence time between lineages was short, which again turned out to be wrong. Finally, the three species are characterized by large differences in karyotype, and we wanted to investigate if chromosomal rearrangements could underlie reproductive isolation, but this goal was actually out of reach with our data.

What is the biggest or most surprising finding from this study? 
It was surprising to us that there was no evidence for interspecific gene flow since hybrids have been observed. We were also very surprised by the deep divergence times between these virtually identical species. Besides that, we do not think the results are really surprising, but they do give some novel insight into the patterns of genomic divergence when there is no introgression and when chromosomes lack centromeres. One observation that we found interesting was that regions with high genetic differentiation (FST) had higher genetic divergence (DXY) than the genomic average. This may sound intuitive, but many previous ‘genome-scan’ studies have in fact found a negative relationship between differentiation and divergence, most likely as a consequence of reduced recombination in some regions leading to reduced diversity already before lineages started to diverge.
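As a rough illustration of the two statistics contrasted above, the sketch below computes FST and DXY for a genomic window from the allele frequencies of biallelic sites in two populations. It is not the pipeline used in the study: the function names are hypothetical, it assumes known population allele frequencies (no sample-size correction), it uses a Hudson-style ratio-of-sums FST, and it averages DXY only over the sites supplied.

```python
from typing import Sequence, Tuple


def site_dxy(p1: float, p2: float) -> float:
    """Probability that two alleles, one drawn from each population, differ."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def window_fst_dxy(freqs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (FST, DXY) for a window of biallelic sites.

    freqs holds one (p1, p2) pair per site: the frequency of the same
    allele in population 1 and population 2 (assumed known exactly).
    """
    if not freqs:
        raise ValueError("window contains no sites")
    # Between-population diversity summed over sites.
    between = sum(site_dxy(p1, p2) for p1, p2 in freqs)
    # Between minus mean within-population diversity reduces to (p1 - p2)^2.
    excess = sum((p1 - p2) ** 2 for p1, p2 in freqs)
    dxy = between / len(freqs)
    # FST is undefined when no site varies between the populations.
    fst = excess / between if between > 0 else float("nan")
    return fst, dxy


if __name__ == "__main__":
    # Hypothetical windows: one with a fixed difference, one with shared polymorphism.
    print(window_fst_dxy([(1.0, 0.0)]))
    print(window_fst_dxy([(0.5, 0.5)]))
```

Because FST is a ratio of between-population to total diversity, a window with little within-population diversity can show high FST without high DXY, which is why the two statistics can be decoupled or even negatively related in other genome scans.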

Moving forward, what are the next steps for this research? 
We are developing more resources to generate genome assemblies of multiple species in the study system and we are also working on establishing high-density linkage maps for multiple populations with different karyotypes. These tools will help us pinpoint chromosome rearrangements and investigate if these have played a role in the divergence process. The data will also be used to quantify the effects of fissions and fusions on the recombination landscape. We are also delving into other approaches to understand how ecological and behavioral differences between species leave footprints in the DNA sequences or epigenetic marks (and vice versa). Given the deep divergence times between species and the apparent lack of gene flow, we will mainly focus on intraspecific comparisons where we observe some incompatibilities between some populations with distinct karyotypes.

What would your message be for students about to start their first research projects in this topic? 
We would suggest reading up on the previous literature in detail. We also encourage students to contact leading researchers in the field to discuss potential questions. Most people are really helpful and interested in knowing about other research efforts within their field. Discussing directly with experienced researchers also gives a hint about the key questions that should be addressed to extend the knowledge in the field. Given the copious amount of data we generate these days and the integrative nature of the questions we ask, it is also crucial to develop some skills in bioinformatics and scripting and to have a network of collaborators/colleagues that can provide help and support in theory, experimental studies and data analyses.

What have you learned about science over the course of this project? 
That science is an extremely time-consuming and dynamic process and that the first glimpse of the data does not necessarily reflect the final results. Moreover, project plans need to be revised regularly, because the initial strategies do not always work out as they were outlined. We also acknowledge the importance of establishing a network of colleagues with expertise in different areas of the field. In our experience, most research projects within our field are getting more and more integrative, and it will be increasingly difficult to conduct advanced research without collaboration.

Describe the significance of this research for the general scientific community in one sentence.
We verify that genomic differentiation between diverging lineages is affected by a complex interaction between molecular mechanisms and evolutionary forces and stress the importance of studying organisms with different genomic features, demographic histories and life-history characteristics.

Describe the significance of this research for your scientific community in one sentence.
In contrast to much of the previous work on patterns of genomic diversity and differentiation, our study provides insight into divergence processes when the effects of gene flow and/or a shared and highly variable recombination landscape are absent.

Full article: Talla V, Johansson A, Dincă V, et al. Lack of gene flow: Narrow and dispersed differentiation islands in a triplet of Leptidea butterfly species. Mol Ecol. 2019;28:3756–3770. https://doi.org/10.1111/mec.15188.