Interview with the authors: How can we use machine learning and genomics to make predictions about the effects of climate change?

In a recent paper in Molecular Ecology Resources, Fitzpatrick et al. used a combination of common garden experiments, genome sequencing, and machine learning analyses to understand how genetic offsets (a measure of maladaptation) can be used to predict how organisms might respond to future environmental change. They found that genetic offset was negatively associated with growth and was a better predictor of performance than environmental differences between the sampling sites and the common gardens alone. See the full article for more details on how these trends aligned with panels of putatively informative and randomly selected SNPs, and the interview with lead author Matthew Fitzpatrick below for even more insight into this exciting work.

What led to your interest in this topic / what was the motivation for this study? My research focuses on spatial modeling of biodiversity and involves forecasting how climate change may impact natural systems. Demand for such forecasts continues to grow given the threats facing biodiversity. However, a major – and often overlooked – challenge is assessing forecasting models, which is really important given their potential to (mis)inform conservation. 

The motivation for this study was to test a type of genomics-based forecast founded on an idea that my coauthor Steve Keller and I developed a few years ago that we termed “genetic offsets”. Genetic offsets are in essence a forecast of climate maladaptation based on existing relationships between (adaptive) genomic variation and climate gradients. We tested how well genetic offsets correspond to biological responses to rapid climate change – in this case by transplanting trees from their home climate to a common garden experiment and measuring their response.
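
To make the idea concrete, here is a minimal sketch of how an offset of this kind can be computed. It is not the code used in the paper. It assumes, as Gradient Forest-based offsets are usually described, that each climate variable is first rescaled by a monotonic "turnover" curve fitted to the genomic data, and that the offset is the Euclidean distance between the rescaled home and new climates. The curves, variable names, and climate values below are all made up for illustration.

```python
import numpy as np

def transform(env_value, grid, cumulative_importance):
    """Rescale a raw climate value using a turnover curve (linear interpolation)."""
    return float(np.interp(env_value, grid, cumulative_importance))

def genetic_offset(home_env, new_env, turnover):
    """Euclidean distance between two climates after rescaling each variable."""
    diffs = []
    for var, (grid, cum_imp) in turnover.items():
        home = transform(home_env[var], grid, cum_imp)
        new = transform(new_env[var], grid, cum_imp)
        diffs.append(home - new)
    return float(np.sqrt(np.sum(np.square(diffs))))

# Hypothetical turnover curves: (climate grid, cumulative importance).
# In practice these would come from a fitted model, not be typed in by hand.
turnover = {
    "mean_temp_C": (np.array([0.0, 5.0, 10.0]), np.array([0.0, 0.1, 0.5])),
    "precip_mm": (np.array([300.0, 600.0, 900.0]), np.array([0.0, 0.2, 0.3])),
}

home = {"mean_temp_C": 5.0, "precip_mm": 600.0}     # made-up source climate
garden = {"mean_temp_C": 10.0, "precip_mm": 600.0}  # made-up common garden climate

offset = genetic_offset(home, garden, turnover)
print(f"genetic offset: {offset:.3f}")
```

In this toy example only temperature differs between the two sites, so the offset equals the change along the temperature turnover curve. A steeper curve means the same change in climate translates into a larger predicted maladaptation.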

What difficulties did you run into along the way? There were all sorts of challenges one might expect with setting up and running common garden experiments in two countries, which as the modeler on the project I, thankfully, was largely isolated from. We were lucky to have Raju Soolanayakanahally on our team to help with common garden logistics in Canada, along with Steve’s lab running the Vermont common garden. Additionally, there was the challenge of how best to evaluate the population genomic data for signatures of local adaptation prior to the genetic offset modeling. It is always a challenge to ensure you’re minimizing the effects of population structure and false positives. Steve and his former postdoc Vikram Chhatre approached this from several angles to make sure we had a robust set of selection outliers. From the modeling perspective, we had to be creative about fitting and summarizing a very large number of machine learning models.

What is the biggest or most surprising innovation highlighted in this study? We found pretty solid evidence that genetic offsets can serve as a meaningful estimate of the degree of expected maladaptation of populations exposed to climate change. It was nice to get some confirmation of our idea, but what was really surprising was that sets of randomly selected SNPs predicted performance of trees as well as or slightly better than did our set of carefully selected candidate SNPs, which was the opposite of what we expected. We’ve seen some other evidence in our simulation studies that also suggests SNPs from the genomic background can be predictive of maladaptation, although the reasons for this are still being investigated.

Moving forward, what are the next steps in this area of research? Ours is a single study on a single species of tree. Many more tests are needed in other study systems before we can fully understand the situations in which genetic offsets can serve a useful purpose. Also, our study tested genetic offsets derived from the machine learning method Gradient Forest, but Gradient Forest is just one of several statistical methods that can be used to estimate offsets. An important next step in my lab is to perform similar testing using another promising method known as generalized dissimilarity modeling.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? Take good notes and document the process! You will thank yourself later. If you are developing a new method, it is important to thoroughly test it to be sure you understand how it behaves in different circumstances and to make clear its intended uses before publishing on it. And last, teach others to use your method! 

What have you learned about methods and resource development over the course of this project? I thought I knew a lot about Gradient Forest and its behavior, but this study – and another we have in review on testing genetic offsets using simulated data – taught me that methods do not always behave the way we might expect or hope. And even when we have simulated “truth known” data, it can be difficult to understand why methods are behaving a certain way.

Describe the significance of this research for the general scientific community in one sentence. This study shows that for some organisms it may be possible to use genetic data to inform climate change impact assessments.

Describe the significance of this research for your scientific community in one sentence. This study provides evidence that spatial patterns of adaptive genomic variation along climatic gradients can be used to estimate the magnitude of expected maladaptation of populations exposed to rapid climate change through time.  

Fitzpatrick MC, Chhatre VE, Soolanayakanahally RY, Keller SR. 2021. Experimental support for genomic prediction of climate maladaptation using the machine learning approach Gradient Forests. Molecular Ecology Resources. https://doi.org/10.1111/1755-0998.13374.

Interview with the authors: does indoor spraying alter the genetic diversity of malaria-causing parasites and what does this mean for long-term control?

In a recent paper in Molecular Ecology, Argyropoulos and Ruybal-Pesántez et al. (2021) investigated the effects of indoor spraying on Plasmodium falciparum, the human malaria-causing protist. They find that 3 consecutive years of indoor spraying reduced transmission and prevalence of malaria by 90% and 35%, respectively, in the high malaria transmission site they surveyed. Despite these large reductions, a change in genetic diversity in P. falciparum that would indicate a large reduction in population size was not detected, illustrating the incredible resiliency of this parasite. Based on these data, the authors suggest that limiting malaria transmission in high transmission areas will require continued indoor spraying or other interventions such as mass drug administration. See the full article and interview with first authors Argyropoulos and Ruybal-Pesántez below for more details of this exciting work.

What led to your interest in this topic / what was the motivation for this study? Global efforts over the past 20 years have significantly reduced malaria mortality and morbidity around the world, but malaria transmission remains high in many countries in sub-Saharan Africa. A major challenge is the fact that most Plasmodium falciparum infections are asymptomatic, creating a persistent parasite reservoir that continually fuels transmission to mosquitos. Our group has a long-standing collaboration with colleagues at the Navrongo Health Research Centre and Noguchi Memorial Institute of Medical Research in Ghana, and the University of Chicago in the US, to conduct longitudinal field-based epidemiological studies of the P. falciparum reservoir in Bongo District, Ghana (Tiedje et al., 2017). Our motivation for this study was to understand P. falciparum transmission dynamics in the context of the roll-out of a malaria control intervention by combining population genetics with more traditional epidemiological and entomological parameters. Our previous research in Bongo District established that there were high levels of P. falciparum genetic diversity with no population structure (Ruybal‐Pesántez et al., 2017). We were therefore interested in exploring whether the addition of a short-term indoor residual spraying (IRS) programme against a background of widespread long-lasting insecticidal nets (LLINs) would bottleneck this P. falciparum population in Bongo and lead to reductions in diversity and changes in population structure.

What difficulties did you run into along the way? One of the major technical limitations in P. falciparum genotyping is phasing multi-genome infections to assign multilocus haplotypes. Eighty per cent of the population of all ages where we work in Ghana have multiple diverse parasite genomes. This is also a problem for whole genome sequencing of isolates. To get around this problem, we focus on genotyping monoclonal infections using panels of multi-allelic microsatellite markers or biallelic SNPs. In high-transmission settings like our study site in Ghana, microsatellite genotyping of P. falciparum provides increased power of inference and higher resolution than biallelic SNPs (Anderson et al., 2000; Ellegren, 2004; Selkoe and Toonen, 2006).

What is the biggest or most surprising innovation highlighted in this study? In our paper, we find that the addition of three rounds of IRS against a background of LLINs between 2013 and 2015 did not lead to a population bottleneck or a dramatic change in parasite genetic diversity. This was striking because IRS did achieve a >90% reduction in local malaria transmission intensity and 37.5% fewer malaria infections in the community. Rebound of P. falciparum transmission is therefore highly likely if these control programmes are not implemented long-term.

Moving forward, what are the next steps in this area of research? Population genomic approaches are increasingly being applied to enhance our understanding of epidemiology, transmission dynamics, and public health strategies for a variety of pathogens. In the malaria field, the potential of genomic data to guide control and elimination strategies has been recognized but is still in early stages with respect to its translation into general practice. In our paper, we highlight that genomic surveillance is pivotal to assess progress towards achieving the World Health Organisation Global Technical Strategy for Malaria 2016-2030 targets. Along with our collaborators in Ghana, we have conducted follow-up surveys in our study site to track the long-term implications of this IRS intervention, as well as other interventions that have been rolled out across Bongo District since 2015. We are also applying phylodynamic approaches to characterize variant antigen genes to further explore the impact of interventions on P. falciparum adaptation and fitness, as alternate but complementary surveillance metrics in this high-transmission setting. 

Dionne Argyropoulos, co-first author on this paper, is investigating the neutral and adaptive genetic diversity of P. falciparum in these follow-up surveys and in the context of other control interventions as part of her PhD research. Shazia Ruybal-Pesántez, co-first author on this paper, is currently applying a suite of genomic epidemiology approaches to better understand residual and resurgent malaria transmission dynamics in the Asia-Pacific and Americas regions as part of her post-doctoral research.

What have you learned about methods and resources development over the course of this project? Firstly, it is important that you understand the basic principles of the concepts that you are using. It may seem rudimentary, but these principles will ensure that you are answering the scientific question that you are interested in and are maintaining scientific integrity throughout the research process. Asking for help or support from others in your field is also a useful way to bounce ideas around and enhance your understanding of your research findings. The most exciting part of Molecular Ecology is how we utilise the insights from molecular techniques to answer big picture questions. Our study integrated population genetics and genomic surveillance to address key research questions about malaria transmission and control interventions. To do this, we used existing molecular techniques (i.e., microsatellites) in new ways (i.e., to evaluate IRS over time). We also believe that it is important to not be afraid to apply novel techniques to new research questions, such as using bioinformatic tools and various packages in R.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? This project was unique as it involved field sample collection and processing, parasite genotyping, and data generation, and the analysis required combining traditional epidemiological methods with population genetics and genomics approaches. When working with large sample sets and datasets, it is critical to pay attention to detail during data generation, curation and downstream analyses. Developing and strengthening coding skills was instrumental in enabling us to execute the necessary analyses of these data. We found R to be an incredibly useful resource to document our analyses and facilitate discussion and interpretation of the data with colleagues, while ensuring reproducibility of our work. We used several well-established R packages for data management and the population genetics analyses. Overall, this multidisciplinary project would not have been possible without being part of a multidisciplinary team with a wealth of knowledge and the strong collaborations with experienced researchers in Ghana.

Describe the significance of this research for the general scientific community in one sentence. We show how parasite genetics can be harnessed to better understand the efficacy of malaria control interventions, particularly by identifying key factors leading to parasite resilience that may not be reflected in other commonly used evaluation metrics. 

Describe the significance of this research for your scientific community in one sentence. Short-term indoor residual spraying with insecticides did not cause a dramatic change in the genetic diversity of P. falciparum in Bongo District, Ghana; therefore, long-term strategies are necessary to genetically bottleneck the parasite population.

Argyropoulos DC*, Ruybal-Pesántez S*, Deed SL, Oduro AR, Dadzie SK, Apparu MA, Asoala V, Pascual M, Koram KA, Day KP, Tiedje KE. 2021. The impact of indoor residual spraying on Plasmodium falciparum microsatellite variation in an area of high seasonal malaria transmission in Ghana, West Africa. Molecular Ecology. https://doi.org/10.1111/mec.16029. (*joint lead authors)

Joint lead authors Dionne Argyropoulos (left) and Shazia Ruybal-Pesántez (right). Photo Credits: The Stockholm International Youth Science Seminar, Unga Forskare; http://www.ungaforskare.se (left) and The Walter and Eliza Hall Institute of Medical Research; www.wehi.edu.au (right).

Summary from the authors: Estimating contemporary effective population size

Effective population size (Ne) is a crucial parameter in evolutionary biology that reflects the number of individuals in a theoretically ideal population having the same magnitude of loss of genetic variation as the population in question. There are several types of Ne estimates, and they vary in definition and application. For example, contemporary Ne represents the size of a population in the previous generation/s and is a parameter of relevance in many species. Estimating contemporary Ne is, however, difficult, and in practice it often remains unknown. This is particularly the case for large populations where the amount of drift in the short term is limited. We used genomic data from 85 collared flycatchers of an island population sampled at two time points, and applied several methods to estimate Ne. These methods either compared genetic variation between the two time points (temporal methods) or analyzed variation patterns from a single time point (LD-based methods). The temporal methods estimated Ne at a level of a few thousand, while the approach based on LD provided ambiguous estimates associated with high variance. Our results suggest that whole-genome data can help to estimate large contemporary Ne, but temporal sampling seems to be necessary.
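
For readers unfamiliar with temporal methods, the sketch below illustrates the general idea with one classical moment estimator. It is an illustration only, not necessarily any of the estimators used in the article. It assumes biallelic loci, the Nei and Tajima standardized variance in allele frequency change (Fc), and the usual correction for sampling noise when the two samples are drawn independently. The allele frequencies and sample sizes are made up.

```python
import numpy as np

def nei_tajima_fc(p0, pt):
    """Standardized variance in allele frequency change, averaged over loci."""
    p0 = np.asarray(p0, dtype=float)
    pt = np.asarray(pt, dtype=float)
    denom = (p0 + pt) / 2 - p0 * pt
    keep = denom > 0  # drop loci fixed for the same allele at both time points
    return float(np.mean((p0[keep] - pt[keep]) ** 2 / denom[keep]))

def temporal_ne(p0, pt, generations, n0, nt):
    """Moment estimate of Ne from two samples taken `generations` apart.

    n0 and nt are the numbers of diploid individuals sampled at each time point.
    """
    fc = nei_tajima_fc(p0, pt)
    drift = fc - 1 / (2 * n0) - 1 / (2 * nt)  # remove the sampling-noise component
    if drift <= 0:
        return float("inf")  # no drift signal beyond sampling noise
    return generations / (2 * drift)

# Made-up allele frequencies at two loci, sampled two generations apart
p_first = [0.5, 0.5]
p_second = [0.6, 0.4]
ne_hat = temporal_ne(p_first, p_second, generations=2, n0=50, nt=50)
print(f"estimated Ne: {ne_hat:.1f}")
```

The correction term shows why large populations are hard: when drift is weak, the observed frequency change is barely larger than what sampling alone would produce, so the denominator approaches zero and the estimate becomes unstable or infinite.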

Article: Nadachowska-Brzyska K, Dutoit L, Smeds L, Kardos M, Gustafsson L, Ellegren H. 2021. Genomic inference of contemporary effective population size in a large island population of collared flycatchers (Ficedula albicollis). Molecular Ecology https://doi.org/10.1111/mec.16025.

Interview with the authors: genomic and phenotypic divergence between populations in translocated species

In a recent issue of Molecular Ecology, Taylor et al. explore how between-population translocations of a small and endangered freshwater fish may break the long-term evolutionary boundaries between populations in this species. In this study, the researchers used a combination of genomic and phenotypic data to show that translocation efforts, which were necessary for meeting species conservation goals, could alter some important genetic and morphological differences between populations. To read the complete story, see the full article now available online as well as the interview with the authors below.

What led to your interest in this topic / what was the motivation for this study? Some excellent work with microsatellites had previously identified three populations of Bluemask Darters across their small range (Robinson et al. 2013, Cons. Gen.). One population, larger and more genetically diverse than the others, was in the Collins River, in the western portion of the range. A second population was in Rocky River, more central. A third population was in Cane Creek and the Caney Fork to the east. There was also a population in the Calfkiller River, which has been extirpated for several decades. In this context, captive-reared Bluemask Darter progeny from the Collins River population were being introduced to the Calfkiller River. But the location of the Calfkiller, near the center of the range, gave an important quirk to the system. If the three populations were not equally distinct, then the Calfkiller River might be better stocked with individuals from Rocky River, Cane Creek, or Caney Fork, rather than the western Collins River. In other words, the geography of the system meant that we needed to know the phylogenetic or hierarchical structure of the populations to know what boundaries might be lurking between Collins River and an introduced population in the Calfkiller River.

What difficulties did you run into along the way? One challenge in our project was navigating the connection between our scientific discoveries and the underlying goals of conservation. Our analyses were focused on the quantitative aspects of Bluemask Darter phylogenetics. However, at the end of the day, we are talking about an endangered species, incredibly imperiled, with a tiny range and an uncertain future. No quantitative value can give us strict guidance about the normative problems of conservation. So a challenge was to unpack, as best as we could, how our conclusions about the phylogenetics, population structure, and demography of this species could ultimately help us conserve the multiple diverging lineages of Bluemask Darters. The reviewers and editors from Molecular Ecology helped us refine our logic and our language, and the final result is a paper that acknowledges the complexities and competing concerns of translocation in a system like this.

What is the biggest or most surprising innovation highlighted in this study? One of the most significant findings of this study was the discovery of two divergent clades of Bluemask Darters — precisely the boundary being broken by current conservation management decisions that move fish between clades! One clade includes western individuals and the other includes eastern individuals. To top it off, we had the unique opportunity to use historic morphological data from across the range, including the Calfkiller River site where the fish had been extirpated, and which was now being restored with fish originating from the western population. The consistent result was that eastern sites harbor a distinct population from western sites, and that the Calfkiller River was associated with the eastern population. It is now apparent that translocated individuals should be from a source consistent with the clade that previously occupied the Calfkiller River, and from a source that will not artificially perturb existing evolutionary boundaries. In our study, there are additional complicating factors — the ideal eastern translocation sources are low abundance and not as genetically diverse. So our study was also a new opportunity to address how we might balance multiple concerns, with genetic details, while addressing a complicated conservation issue.

Moving forward, what are the next steps in this area of research? In our paper, we discuss how there are juvenile Bluemask Darters that drift into the reservoir at the center of the range and may not be able to migrate upstream to appropriate habitats needed as adults. These young fish are from the Rocky River, and are part of the appropriate clade for restocking the Calfkiller River. However, the success of this strategy would depend on the population dynamics of young fish in the reservoir. Jeff Simmons, co-author on this paper, and colleagues will be pushing forward with the critical next steps. There will be studies of the density and abundance of juvenile fish in the reservoir, including whether juveniles recruit into a breeding population or simply perish before maturity. There is also ongoing monitoring of the translocated fish in the Calfkiller, and across the species range. All of this work is being combined with habitat quality monitoring aimed at unraveling the location, frequency, and cause(s) of water quality issues that are harming darters in this system. All together, we’re continuing to build a picture of how best to conserve the distinct lineages of Bluemask Darters. 

What have you learned about methods and resources development over the course of this project? Making this project successful meant combining dozens of different analyses (assembling, aligning, and filtering sequences, phylogenetics, population structure, genetic differentiation statistics, and demographic simulations, to name a few), each of which has its own traps and idiosyncrasies. Getting these methods working required, first, well, getting everything to run, and then getting everything to run correctly. As useful as online documentation is, I learned there is no substitute for learning with colleagues who are engaging in similar research. Shout out especially to Dan MacGuigan, Daemin Kim, and Ava Ghezelayagh, all students with Tom Near. My conversations with these and other colleagues were critical for avoiding analytical pitfalls. These conversations also spurred ideas about new analyses and perspectives that will continue moving phylogenetic and population genetic work forward.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? It’s been said before, but it really was important to have reproducible code for this project. Working with next-generation sequence data meant an enormous number of different files and analysis packages. Being able to switch between versions (like with git), automate programs (like with bash scripts), and manage software environments (like with conda) saved us hundreds of hours. At the end, you can neatly package everything up; all of our data and code, for example, are now stored in a Dryad repository that could basically reproduce our paper from scratch in just a few commands. Even after publication, sharing code has also meant starting new conversations with other scientists about best practices, alternate methods, and new ideas for genetic analyses.

Describe the significance of this research for the general scientific community in one sentence. Our study uses genetic and morphological data to unravel how translocation strategies for an endangered freshwater fish might balance the competing conservation concerns of phylogenetic divergence, genetic diversity, and population demography.

Describe the significance of this research for your scientific community in one sentence. Our study identifies two distinct clades of endangered Bluemask Darters across their small range, where current management decisions are translocating individuals across those diverging lineages.

Taylor LU, Benavides E, Simmons JW, Near TJ. 2021. Genomic and phenotypic divergence informs translocation strategies for an endangered freshwater fish. Molecular Ecology. https://onlinelibrary.wiley.com/doi/10.1111/mec.15947.

Summary from the authors: landscape genetics of eastern indigo snakes

Landscape features, such as land use, vegetation cover, roads, and topography, strongly influence genetic connectivity, yet these relationships can vary across spatial scales, so multi-scale approaches are required for evaluating landscape genetics relationships. We used the federally threatened eastern indigo snake (Drymarchon couperi), a terrestrial habitat generalist endemic to the southeastern United States, as a case study with which to evaluate the consequences of different approaches for accounting for spatial scale when optimizing genetic resistance surfaces using the software ResistanceGA. Resistance surfaces with scale selected using a true optimization approach, simultaneously comparing all possible combinations of scale across each set of covariates, performed better than resistance surfaces where scale was selected individually for each covariate. Truly optimized resistance surfaces also outperformed resistance surfaces based on habitat selection models and categorical land cover maps. Optimal scales were usually larger than average indigo snake home range sizes, suggesting that gene flow was mediated mostly by extra-home range dispersal. Large tracts of undeveloped upland habitat with intermediate habitat heterogeneity most promoted indigo snake gene flow, while roads did not appear to restrict gene flow. Our results show the importance of testing a wide range of spatial scales in landscape genetics studies.

The top-ranked optimized genetic resistance surfaces for eastern indigo snakes in central Florida from (a) categorical land cover surfaces, (b) multi-scale habitat selection models, and (c) multi-scale landscape covariates selected using a true optimization approach. (d) shows an average resistance surface across our best-supported truly optimized resistance surfaces. (Figure 6 from Bauder et al. 2021.)

Article: Bauder JM, Peterman WE, Spear SF, Jenkins CL, Whiteley AR, McGarigal K. 2021. Multiscale assessment of functional connectivity: Landscape genetics of eastern indigo snakes in an anthropogenically fragmented landscape in central Florida. Molecular Ecology https://doi.org/10.1111/mec.15979.

2021 Molecular Ecology Prize

We are soliciting nominations for the annual Molecular Ecology Prize.

The field of molecular ecology is young and inherently interdisciplinary. As a consequence, research in molecular ecology is not currently represented by a single scientific society, so there is no body that actively promotes the discipline or recognizes its pioneers. The editorial board of the journal Molecular Ecology therefore created the Molecular Ecology Prize in order to fill this void, and recognize significant contributions to this area of research. The prize selection committee is independent of the journal and its editorial board.

The prize will go to an outstanding scientist who has made significant contributions to molecular ecology.  These contributions would mostly be scientific, but the door is open for other kinds of contributions that were crucial to the development of the field.  The previous winners are: Godfrey Hewitt, John Avise, Pierre Taberlet, Harry Smith, Terry Burke, Josephine Pemberton, Deborah Charlesworth, Craig Moritz, Laurent Excoffier, Johanna Schmitt, Fred Allendorf, Louis Bernatchez, Nancy Moran, Robin Waples, Scott Edwards, and Victoria Sork.

Please send your nomination with a short supporting statement (no more than 250 words; longer submissions will not be accepted) and the candidate’s CV directly to Scott Edwards (sedwards@fas.harvard.edu) by Friday, April 16, 2021.  Organized campaigns to submit multiple nominations for the same person are not necessary and can be counterproductive.  Also, note that nominations from previous years do not roll over.

Nominations are now open for the Harry Smith Prize 2021

The editorial board recently established a new prize that recognizes the best paper published in Molecular Ecology or Molecular Ecology Resources in the last year by graduate students or early career scholars with no more than five years of postdoctoral or fellowship experience. The prize is named after Professor Harry Smith FRS, who founded Molecular Ecology and served as both Chief and Managing Editor during the journal’s critical early years. He continued as the journal’s Managing Editor until 2008, and he went out of his way to encourage early career scholars. In addition to his editorial work, Harry was one of the world’s foremost researchers in photomorphogenesis, leading to concepts such as “neighbour detection” and “shade avoidance,” which are essential to understanding plant responses to crowding and competition. His research provided an early example of how molecular data could inform ecology, and in 2008 he was awarded the Molecular Ecology Prize that recognized both his scientific and editorial contributions to the field. As with the Molecular Ecology Prize, the winner of this annual prize is selected by an independent award committee, but the Harry Smith Prize comes with a 1,000 USD cash award, an announcement in the journal and on social media, as well as an invitation to join the Molecular Ecology Junior Editorial Board. Please send a short supporting statement (no more than 250 words; longer submissions will not be accepted) and PDF of the paper you are nominating to Dr. Alison Nazareno (nazareno@umich.edu) or Dr. Katrina West (katrina.m.west@postgrad.curtin.edu.au) by Friday 31 May 2021. Self-nominations are accepted.

Associate Editor vacancies

Molecular Ecology and Molecular Ecology Resources are looking for new Editorial Board members to join the journals as Associate Editors in the key subject areas below:

  • Eco-immunology/emerging diseases/disease resistance
  • Proteomics/protein evolution
  • Computer programs/statistical approaches
  • Environmental DNA/metabarcoding

Experience with genome assemblies would also be advantageous.  

Nominations and applications are welcome and whilst scientific qualifications are paramount, we would particularly appreciate nominations and applications from suitably qualified researchers from underrepresented groups (including women, ethnic minority scientists, scientists with disabilities and other underrepresented groups). Please email nominations/applications by October 15th, 2020 to manager.molecol@wiley.com with the following items:

  • Cover letter stating the reasons for your nomination, or if applying for yourself, your interest in the role and familiarity with the journals,
  • Abbreviated CV (Education, Publications, Outreach) if you have it.

Interview with the authors: How does invasiveness evolve? A look at feral pigs

Understanding how and why some species readily invade new habitats is an interesting view into the myriad ways species evolve. Limiting the expansion of such introduced species can be important for managing ecosystems, particularly when the invasive species is as ecologically destructive and economically costly as the feral swine in the US south. In a paper published recently in Molecular Ecology, researchers led by Dr. Tim Smyser investigated the origins of the invasive feral swine populations to determine how much the expanding footprint of this species was due to recently escaped domesticated pigs. Surprisingly, they found that the expanding range was largely attributable to range expansion by the established invasive swine population. Read on for more details from Dr. Smyser on this very interesting work!

Invasive feral swine originated from a combination of European feral pigs and domesticated stock. Photo by Dr. Mirte Bosse, dvdwphotography.

What led to your interest in this topic / what was the motivation for this study? 

Invasive feral swine have expanded rapidly throughout the United States over the past 30 years. The impetus for the study was to identify the drivers for that expansion, to ask: where are new feral swine populations coming from? Prior to our work, there was a hypothesis that domestic pigs had sufficient phenotypic plasticity that they would revert to a wild phenotype, resembling a wild boar, if living in the wild. Under this hypothesis, any pig farm could have served as a viable source population for invasive feral swine. With this study, we revealed that there is very little direct contribution to invasive feral swine populations from domestic pigs, potbellied pigs, or wild boar. Rather, the rapid expansion observed over the past 30 years has been driven by incremental range expansion of established invasive feral swine, which overwhelmingly represent animals of mixed European wild boar-heritage domestic breed ancestry, and long-distance translocation of feral swine from established populations to uninvaded habitats.

What difficulties did you run into along the way? 

The challenges were largely computational. We had amassed over 9,000 genotypes by the time we compiled the reference set and generated genotypes from invasive feral swine for this study. Such a large dataset required that we do everything we could to optimize runtime efficiency. Even with these efforts, the analysis still took about 4 months of runtime while using 30 CPUs with 60 threads.

What is the biggest or most surprising innovation highlighted in this study? 

I would say the most surprising result was the very high proportion of invasive feral swine that had a significant ancestry association to European wild boar. The historical record suggests wild boar releases have been far more limited than the potential for domestic pig releases, yet 97% of feral swine had significant European wild boar ancestry. This might suggest hybrid wild boar-domestic pig ancestry is biologically important for feral swine to establish self-sustaining populations and become invasive.

Moving forward, what are the next steps in this area of research?

Descending from this work, our next steps are multifaceted. With this analysis, we have identified the drivers of range expansion at a broad scale, with ancestry results pointing to the expansion of established populations. We are now interested in adding a fine-scale understanding of expansion to identify the specific sources of newly emergent populations and map the patterns of feral swine expansion. Also, this analysis has provided an understanding of the ancestral composition of invasive feral swine. Given the hybrid origin of these animals, we will identify elements of the genomes from their ancestral groups, that is, heritage breeds of pig and European wild boar, that have been selectively retained in feral swine. By describing selective sweeps relative to ancestral groups, this analysis will allow us to describe the evolution of invasiveness among feral swine.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

The field of Molecular Ecology is changing so quickly that it is hard as a scientist to keep up, from both a computational/statistical standpoint and with all the new molecular techniques and analyses that allow us to dive deeper into the genome than we had previously imagined. My recommendation for students would be to not let the lack of a specific skill deter you from asking interesting questions – take the time to develop the needed skill sets or develop collaborations to facilitate your learning or use of those skills. Also, keep asking questions – don’t be content with the answers we are able to resolve today.

What have you learned about methods and resources development over the course of this project? 

Reflecting back on my answer immediately above, when I started asking the question of what are the drivers of invasive feral swine range expansion, I did not have the data or the skills to meaningfully address that question. Through the development of a great team of collaborators and independent learning, I was able to assemble the needed skills and then the data to pose this question and reveal interesting results. Through this project, I learned about the statistical tools used in the analyses, developed the coding skills necessary to execute those analyses, and identified strategies to maximize computational efficiency as was needed for working with such a large dataset.

Describe the significance of this research for the general scientific community in one sentence.

We have demonstrated that the recent and rapid expansion of feral swine, an ecologically destructive and economically costly invasive species distributed throughout much of the US and the world, has been facilitated by movement (in many cases anthropogenic movement) from established populations to uninvaded habitats as opposed to novel introductions of either domestic pigs or wild boar.

Describe the significance of this research for your scientific community in one sentence.

In identifying the admixed origins of invasive feral swine, descending from heritage domestic pig breeds and European wild boar ancestry, we can begin to gain an understanding of the evolution of invasiveness for this species and invasive species more broadly. 

Feral Swine are not native to the U.S. They are the result of recent and historical (1500’s Spanish explorers) releases of domestic swine and Eurasian boar. USDA APHIS photo Laurie Paulik.

Smyser TJ, Tabak MA, Slootmaker C, Robeson MS, Miller RS, Bosse M, Megens H-H, Groenen MAM, Rezende Paiva S, Assis de Faria D, Blackburn HD, Schmidt BS, Piaggio AJ. 2020. Mixed ancestry from wild and domestic lineages contributes to the rapid expansion of invasive feral swine. Molecular Ecology. https://doi.org/10.1111/mec.15392

Interview with the authors: Can we identify the acting selective regime in evolution experiments?

Rapid adaptation to novel conditions is an exciting and growing area in evolutionary research due, at least in part, to our desire to understand the effects of climate change, introduced species, and other conservation-related concerns. However, our ability to detect this evolution is fraught with both biological realities and technical difficulties. A recent paper by Drs. Pfenninger and Foucault, published in Molecular Ecology, illustrates how deep resequencing of replicated experimental populations can fail to provide evolutionary insights, even with extreme selective pressures, due to adaptation to unintentional environmental conditions that overwhelm the genomic signals of the intended selection. This rapid adaptation, in this case to captivity, is an interesting phenomenon that is almost certain to alter other experimental systems, including those that take place in the field. In addition to more details on this fascinating study, the interview with Dr. Pfenninger below also provides an interesting view into technical issues the research team faced: a short time after the initial publication of their manuscript, they discovered a bug in the allele frequency calling software that they used!

Swarming flight of Chironomus midges over a small puddle. Photo credit: Markus Pfenninger.

What led to your interest in this topic / what was the motivation for this study? 

I wanted to know whether rapid adaptation of a natural population to an environmental stressor, in this case temperature, is possible and, if so, by which processes in detail. Apart from this being a fundamental question in population genetics, it is a crucial issue for biodiversity in the ongoing global change.

This is the official and completely true answer – but, to be completely honest, not the entire story: I wanted to see, analyse and prove evolution by natural selection hands-on. Because it’s one thing to teach something gleaned from literature and another to have seen it with your own eyes.

What difficulties did you run into along the way? 

There were actually quite a few: finding a suitable PhD student, technical difficulties with the experimental facilities that almost killed the long term experiment after some months, to mention only the most important ones.

And finally, of course, the almost detrimental issue with a bugged software tool: a few days after the official publication in January, a student reanalysing the data from a different angle stumbled over unexplainable inconsistencies between the raw data and the allele frequencies inferred from them. You can imagine the shock it gave me!

When I looked into the problem, I quickly found out that the allele frequencies had little to do with the raw data for most, but perfidiously not all, positions in the genome, in particular not the first few on the first scaffold – that’s why the error escaped my attention during a cursory check. The allele frequencies were extracted by a software tool which indeed produced consistently wrong results – a task in principle so simple that systematic checking would have required writing a second tool that did exactly what the first should have done in the first place.

I immediately contacted the authors of the tool and they promptly confirmed that the version we used contained this bug. They did nothing wrong, though. Once they discovered the bug a few months ago, they had promptly updated the tool and documented the error in the release notes. In fact, it appears that the wrong version was on the server for a few days only. Unfortunately it was exactly during the time when we downloaded it – and who looks into the release notes after a tool seemingly did without a hitch what it was supposed to?

I had no choice but to contact the editorial office of Molecular Ecology, informing Genevieve Horn that parts of the publication were flawed and should probably be retracted. At the same time, I started reanalysing the complete data set with a correct version of the tool. Fortunately, after a hard week of number crunching, it turned out that the wrong values were highly correlated in terms of location and allele frequencies to the true values so that some numerical values, but none of the study’s conclusions, needed to be revised. The journal agreed that in this case, a correction article would be sufficient and here it is.

I have to say that everyone I have dealt with in this affair, from the software authors to the editor in chief, has responded admirably, and I want to express my deep gratitude here. Given this experience with Molecular Ecology, I can only encourage everyone to address such unfortunate, and perhaps unavoidable, mistakes immediately and openly.
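
The independent cross-check described in this answer can be sketched in a few lines of Python. This is only a minimal illustration, not the tool or the workflow the authors used: the function names, the dictionary layout, and the toy read counts are all hypothetical. The idea is to recompute allele frequencies directly from raw read counts and compare them with the values a tool reports, checking sites across the whole genome (or a random sample of them) rather than only the first few positions on the first scaffold.

```python
import random


def allele_frequency(ref_count, alt_count):
    """Alternate-allele frequency from raw read counts; None if there is no coverage."""
    depth = ref_count + alt_count
    return alt_count / depth if depth > 0 else None


def find_mismatches(raw_counts, reported, tolerance=1e-6, sample_size=None, seed=0):
    """Recompute frequencies from raw counts and compare with a tool's output.

    raw_counts: dict mapping (scaffold, position) -> (ref_count, alt_count)
    reported:   dict mapping (scaffold, position) -> frequency reported by the tool
    Both layouts are assumptions made for this illustration.
    """
    sites = sorted(raw_counts)
    # Optionally check a random subset spread over the genome instead of every site.
    if sample_size is not None and sample_size < len(sites):
        sites = random.Random(seed).sample(sites, sample_size)

    mismatches = []
    for site in sites:
        expected = allele_frequency(*raw_counts[site])
        observed = reported.get(site)
        if expected is None or observed is None:
            continue  # nothing to compare at this site
        if abs(expected - observed) > tolerance:
            mismatches.append((site, expected, observed))
    return mismatches


# Toy data: the first two sites agree, a later site on another scaffold does not.
raw = {
    ("scaffold1", 1): (8, 2),
    ("scaffold1", 2): (5, 5),
    ("scaffold7", 300): (9, 1),
}
tool_output = {
    ("scaffold1", 1): 0.2,
    ("scaffold1", 2): 0.5,
    ("scaffold7", 300): 0.4,
}

for site, expected, observed in find_mismatches(raw, tool_output):
    print(site, "recomputed:", expected, "reported:", observed)
```

In this toy example, a check limited to the first two positions would pass, while the full comparison flags the discrepancy on the later scaffold.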

What is the biggest or most surprising innovation highlighted in this study? 

The rather unsettling major result of the study was the realisation that it is nearly impossible to experimentally manipulate the selection regime of a natural population in a targeted, predictable manner. I think, however, that such “failures” ultimately advance science by showing which approaches are worth pursuing and which are not. Besides this more philosophical aspect, the study showed the impressive power of rapid polygenic adaptation.

Moving forward, what are the next steps in this area of research?

I am currently moving into analysing population genomic time series from the field to get an idea on the selective forces acting on natural populations.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

Have a good plan, be ready to revise it once the plan meets reality, be prepared for setbacks, remain critical about your results, and incorporate appropriate controls. But perhaps most importantly, always take your time to think about what you are currently doing and what the next steps should be.

What have you learned about methods and resources development over the course of this project? 

Obviously to even more thoroughly back-check every single analysis. Beyond this, I realised the value and potential of population genomic time series analysis.

Describe the significance of this research for the general scientific community in one sentence.

An evolutionary experiment tells you something about the experiment – not necessarily about nature.

Describe the significance of this research for your scientific community in one sentence.

The acting selective regime in evolutionary experiments is difficult to predict and to manipulate – but perhaps it may be inferred from the results.

Pfenninger M and Foucault Q. 2020. Genomic processes underlying rapid adaptation of a natural Chironomus riparius population to unintendedly applied experimental selection pressures. Molecular Ecology 29:536-548. https://doi.org/10.1111/mec.15347