Interview with the authors: Modelling multilocus selection in an individual‐based, spatially‐explicit landscape genetics framework

Genetic variation in natural systems is complex and affected by a variety of processes, and this reality has contributed to the growing popularity of simulation-based approaches that can help researchers understand the processes acting in their systems. Despite the flexibility of simulation-based approaches, simulations of natural selection across a heterogeneous landscape have typically been limited to one or two loci (e.g. Landguth, Cushman, & Johnson, 2012). In a recent issue of Molecular Ecology Resources, Landguth et al. introduce an approach to model multilocus selection in a spatially-explicit, individual-based framework, implemented in the programs CDPOP and CDMetaPOP. Read the interview with lead author Erin Landguth below to learn about the challenges in developing this program, the potential of this approach to help understand complex genotype-environment associations, and the benefits of working with a strong multidisciplinary team! Read the full article here.

Dr. Erin Landguth coding in CDPOP.

What led to your interest in this topic / what was the motivation for this study? 

Over the last two decades, there has been an exponential increase in landscape genetic studies, and still, the methodology and underlying theory of the field are under rapid and constant development. Furthermore, interest in simulating multilocus selection, including the ability to model more complex and realistic multivariate environmental scenarios, has been driven by the growing number of empirical genomic data sets derived from next-generation sequencing. We believe many of the major questions in landscape genetics require the development and application of sophisticated simulation tools to explore the interaction of gene flow, genetic drift, mutation, and natural selection in landscapes with a wide range of spatial and temporal complexities. Our interests lie in developing such tools and providing more flexible models that are linked to theory, and that better represent complex genetic variation in real systems. For example, adaptive traits often have a complex genetic basis that interacts with selection strength, gene flow, drift, and mutation rate in a multivariate environmental context; and this module provides the ability to simulate these processes across many adaptive and neutral loci in a landscape genetic context.

What difficulties did you run into along the way? 

When developing new modules for existing software packages, my first and primary goal is to validate these modules against theory where possible. This can take some time, and many decisions, questions, and rounds of trial and error come up along the way through this very important validation process. For multilocus selection, our validation process was to match simulation output with the theoretical expected change in allele frequencies for selection models developed by Sewall Wright in 1935. If the module is placed in the wrong location in the simulation workflow (i.e., timing) or if all of the Wright-Fisher assumptions are not matched exactly, then the simulation output will not match theoretical expectations. However, once all of these pieces are lined up, there is definitely a eureka moment, and I am then confident in the module’s performance for more complex scenarios where we will not be able to evaluate against theoretical expectations.
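To make the idea of validating against theory concrete, here is a minimal Python sketch. It is not CDPOP or CDMetaPOP code, and the function names, fitness values, and population size are illustrative assumptions. It compares the textbook deterministic recursion for one locus with two alleles under viability selection, p' = p(p·w11 + q·w12) / w̄, with a simple Wright-Fisher simulation in which selection is followed by random sampling of 2N gametes.

```python
import random


def expected_next_freq(p, w11, w12, w22):
    """Deterministic allele frequency after one generation of viability selection."""
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    return p * (p * w11 + q * w12) / mean_fitness


def deterministic_trajectory(p0, generations, w11, w12, w22):
    """Theoretical expectation: iterate the recursion with no drift."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(expected_next_freq(freqs[-1], w11, w12, w22))
    return freqs


def simulated_trajectory(p0, n_individuals, generations, w11, w12, w22, seed=1):
    """Wright-Fisher simulation: selection, then binomial sampling of 2N gametes."""
    rng = random.Random(seed)
    n_gametes = 2 * n_individuals
    freqs = [p0]
    for _ in range(generations):
        p_selected = expected_next_freq(freqs[-1], w11, w12, w22)
        # Each gamete carries the focal allele with probability p_selected.
        copies = sum(rng.random() < p_selected for _ in range(n_gametes))
        freqs.append(copies / n_gametes)
    return freqs


def max_deviation(a, b):
    """Largest absolute difference between two trajectories of equal length."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical fitness values favouring the focal allele.
    theory = deterministic_trajectory(0.2, 20, 1.0, 0.9, 0.8)
    sim = simulated_trajectory(0.2, 5000, 20, 1.0, 0.9, 0.8)
    print("max |simulated - expected| =", max_deviation(theory, sim))
```

In a large population the simulated trajectory should track the expectation closely. If selection were applied at a different point in the life cycle than the recursion assumes, the two curves would be expected to diverge, which is the kind of mismatch the validation step is meant to catch.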

What is the biggest or most surprising innovation highlighted in this study? 

Multivariate environmental selection can produce complex landscape genetic patterns, even when only a few adaptive loci are involved. The relatively simple “complex” example simulated in the paper illustrates how complicated the underlying relationships can be between allele frequencies and environmental conditions. Simulating these complex relationships will be essential for testing genotype-environment association methods in a more rigorous fashion than has been seen so far. Additionally, the ability to simulate realistic landscape genetic scenarios that reflect the environmental complexity of actual landscapes will be important for validating findings from empirical data sets. 

Outcome for simulation of a complex landscape and three loci. The three selection landscapes (Figure 1 of Landguth et al., 2020) are superimposed, with lighter (white) areas indicating where all three landscapes have values of 1 and darker areas indicating where all three landscapes have values of −1. The copies (either 2, 1, or 0) of the first allele for each of the three loci are plotted, where darker green genotypes have more copies of these alleles (e.g., 2, 2, 2 corresponds to 2 copies of the first allele for the first, second and third loci, respectively). The first locus is associated with the categorical landscape (X1; Figure 1a of Landguth et al., 2020). The second locus is associated with the gradient landscape (X2; Figure 1b of Landguth et al., 2020). The third locus is associated with the habitat fragmented landscape (X3; Figure 1c of Landguth et al., 2020).

Moving forward, what are the next steps in this area of research?

Epigenetics! We of course have a number of applications in progress for this current module, but we have already started beta testing our next module for simulating epigenetic processes in landscape genetics.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

Starting a simulation study in landscape genetics for the first time can be daunting and intimidating. Fear not, we say! As with all software packages, there will be a learning curve, but if you persevere and get past the first few hurdles (e.g., learning the ins and outs of file formats, running the program in a potentially unfamiliar programming interface), the door will be opened to unlimited questions that can be addressed with simulations in your system. Additionally, just like any other field study or experiment, simulation modeling is most informative when coupled with specific questions and hypotheses and well-thought-out study designs.

What have you learned about methods and resources development over the course of this project? 

As we begin to add more complex modules to these simulation platforms, I am increasingly relying on multidisciplinary approaches and teams. For example, development of this current module required Brenna Forester for her expertise in landscape ecology and genotype-by-environment concepts, as well as Andrew Eckert, with his in-depth knowledge of population genetics theory, particularly the history of additive vs. multiplicative models for fitness.
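For readers unfamiliar with the additive vs. multiplicative distinction mentioned above, the following Python sketch shows one common textbook way of combining per-locus fitness values into a single individual fitness. It is an illustration under stated assumptions only: the interview does not give the formulation used in CDPOP or CDMetaPOP, and the function names and example values are hypothetical.

```python
import math


def additive_fitness(per_locus_fitness):
    """Sum the per-locus fitness reductions and subtract them from 1 (floored at 0)."""
    total_reduction = sum(1.0 - w for w in per_locus_fitness)
    return max(0.0, 1.0 - total_reduction)


def multiplicative_fitness(per_locus_fitness):
    """Multiply per-locus fitness values, treating loci as independent."""
    return math.prod(per_locus_fitness)


if __name__ == "__main__":
    # Hypothetical fitness values of one individual at three adaptive loci.
    per_locus = [0.9, 0.8, 0.95]
    print("additive:", additive_fitness(per_locus))
    print("multiplicative:", multiplicative_fitness(per_locus))
```

With weak selection at each locus the two combinations give similar values. They diverge as per-locus effects grow, and the additive form needs a floor to keep fitness from going negative.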

Dr. Brenna Forester, post-doctoral researcher at Colorado State University and recently awarded David H. Smith Conservation Research Fellow, helped integrate key genotype-by-environment concepts into the new module.

Describe the significance of this research for the general scientific community in one sentence.

We have implemented a new module into the landscape genetic simulation programs CDPOP and CDMetaPOP that allows realistic multivariate environmental gradients to drive selection in a multilocus, individual-based, landscape genetic framework.

Describe the significance of this research for your scientific community in one sentence.

This new simulation module provides a valuable addition to the study of landscape genetics, allowing for explicit evaluation of the contributions and interactions between demography, gene flow, and selection-driven processes across multilocus genetic architectures and complex, multivariate environmental and landscape conditions.

References

Landguth EL, Forester BR, Eckert AJ, et al. (2020). Modelling multilocus selection in an individual-based, spatially-explicit landscape genetics framework. Molecular Ecology Resources, 20, 605–615. https://doi.org/10.1111/1755-0998.13121

Landguth, E. L., Cushman, S. A., & Johnson, N. A. (2012). Simulating natural selection in landscape genetics. Molecular Ecology Resources, 12, 363–368. https://doi.org/10.1111/j.1755-0998.2011.03075.x

Wright, S. (1935). Evolution in populations in approximate equilibrium. Journal of Genetics, 30, 257–266. https://doi.org/10.1007/BF02982240

Interview with the authors: barriers to fox gene flow in urban and rural settings

In an article published in the latest issue of Molecular Ecology, researchers from the Leibniz Institute for Zoo and Wildlife Research and the Luxembourg National Museum of Natural History investigated differences between urban and rural red fox populations. They found that physical barriers in both habitats, such as a river or road, limited fox movement, and also that human activities influenced where foxes moved. This is important because it means that the interaction between human activity and other structures on the landscape may negatively alter the fox populations. For more information, please see the full article and the interview with lead author Sophia Kimmig below. 

A red fox (Vulpes vulpes) moving along rail roads in the city centre of Berlin, Germany. © Jon A. Juarez.

What led to your interest in this topic / what was the motivation for this study? Human population growth and land use are altering ecosystems worldwide and although continuing urbanization results in dramatic environmental changes, some species seem to cope well with the anthropogenic pressure. Foxes are distributed over the entire metropolitan area of Berlin; therefore, it is usually assumed that they cope well with human presence. However, city life can affect key aspects of wildlife ecology and have a substantial impact on the movement ecology and dispersal ability of populations. Dispersal in urban areas may be influenced by physical barriers, but also by behavioural barriers that we cannot directly see. Thus, species that are physically capable of crossing the urban matrix may nevertheless face behavioural barriers due to avoidance of man-made objects (with their artificial structure, scents etc.) as well as human presence per se. We therefore wanted to understand how the landscape influences gene flow patterns in red foxes across the urban-rural matrix.

What difficulties did you run into along the way? With an increasing number of population genetic clustering approaches and R packages that differ in their precise working mechanisms, it becomes more challenging to interpret diverging results and recognize biological patterns. Further, the promising and fascinating possibilities of modelling gene flow through the landscape also come with uncertainties in how to deal with certain circumstances or types of data. For example, we discuss the issue of dealing with overlapping landscape features in the studied environment, i.e. linear landscape elements (such as roads or rivers) that cross an areal landscape element (e.g. a forest or park). Especially in urban areas, the habitat has such a high level of complexity that you could easily spend years modelling and testing different land use layers.

What is the biggest or most surprising innovation highlighted in this study? Regarding the fox in the Berlin Metropolitan area: Foxes are quite common in urban areas, so we presumed that there would be few dispersal barriers in the urban environment. Our results have nevertheless shown that foxes disperse preferentially along linear landscape elements such as motorways and railway lines despite the inherent mortality risk. We interpreted this to mean that even urban foxes avoid the presence of humans if possible. 
Regarding a broader, biological perspective: Although we have to further improve our methods (for our study, for example, by including data on population densities, road traffic or other proxies of human presence and activity), it is fascinating that molecular genetic methods may enable us to answer more questions in behavioural ecology in the future. 

Moving forward, what are the next steps in this area of research? Now that we have familiarised ourselves with the landscape genetic techniques, we are looking forward to applying the approaches to a broad range of taxa to better understand how animals move through the landscape. This is not just of academic interest, but may help to identify and protect dispersal corridors for endangered species in a scientifically robust way.
For the Berlin foxes we are going to analyse data from a radio tracking study and research their movement patterns and space use – it will be interesting to compare those results with the ones from landscape genetics. We are looking forward to hopefully adding some more pieces to the puzzle of the city as a wildlife habitat.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? From a beginner’s perspective: For our project, we greatly benefitted from the exchange with other researchers working in this field. For example, we contacted William Peterman, who created the ResistanceGA package that we used for our landscape resistance analysis, with some methodological questions and he provided very helpful advice. I would therefore recommend getting in touch with people who work with the same methods and discussing your ideas and obstacles. Also, our work greatly benefitted from the thorough review process that it underwent. Although the requested changes and suggestions may sometimes require a lot of rethinking and reworking, and we do not always agree with every single comment, it is crucial to take constructive criticism on board to improve our scientific work.

What have you learned about methods and resources development over the course of this project? Molecular genetic methods and their inherent potential to study complex ecological contexts have changed a lot in recent decades. This is a field of continual development and improvement. Especially regarding the analytical methods, it is difficult for an ecologist to keep track of all the latest approaches. Also, because of the large data sets involved in the landscape analysis and the resulting computation time, the computational effort for a model becomes a real issue in landscape genetics. It is really a pity when more thorough analyses are theoretically possible and even free data are available, but the analysis simply cannot be conducted in a feasible amount of time.

Describe the significance of this research for the general scientific community in one sentence. Assessing the impact of the habitat on (urban) wildlife beyond the physical properties of the landscape may help us to more deeply understand dispersal, behaviour and population genetic structure of populations.

Describe the significance of this research for your scientific community in one sentence. Methodological advancement due to more in-depth comparisons of different genetic measures used in resistance modelling.

Kimmig SE, Beninde J, Brandt M, Schleimer A, Kramer-Schadt S, Hofer H, Börner K, Schulze C, Wittstatt U, Heddergott M, Halczok T, Staubach C, Frantz A. 2020. Beyond the landscape: resistance modeling infers physical and behavioral gene flow barriers to a mobile carnivore across a metropolitan area. Molecular Ecology. https://doi.org/10.1111/mec.15345.

Interview with the authors: historical barriers to gene flow in a fragmenting landscape

In a recent issue of Molecular Ecology, Drs. Maigret, Cox, and Weisrock published their work focused on copperhead snake response to habitat fragmentation. Interestingly, these researchers detected population structure putatively resulting from a historically important highway, even though most traffic has been shuttled to an alternative route for the last 50 years. Understanding the complexities of movement patterns in response to barriers is of increasing importance as our landscape becomes more and more fragmented. For more information, please see the full article and the interview with Dr. Maigret below. 

What led to your interest in this topic / what was the motivation for this study? The immense and rapid shift from forest to barren land and grassland which accompanies surface mining in central Appalachia is striking, especially when viewed from the air. Upwards of 20% of the land surface of some counties has been mined since 1980 through a process often termed “mountaintop removal”. The lack of research on the implications of this fragmentation was curious to me: why had such a major driver of forest loss garnered so little attention? Moreover, if we use next-generation sequencing, could we detect any effects of this land-use change on wildlife populations? It seemed like a nice natural experiment waiting to be investigated.

What difficulties did you run into along the way? Fieldwork was challenging: on top of the issues one deals with when trying to capture large numbers of secretive venomous snakes, nearly all the land in our study area is privately held, and thus gaining access to properties to collect tissue samples was time consuming. In terms of generating our data, obtaining enough DNA from our tissues (mainly scale clips) proved to be a challenge, though DNA quality was fortunately not an issue. Finally, given the diverse array of methods and subsampling protocols we used, optimizing our software pipeline took a little extra time. Thankfully, our university’s computing resources – including our associated staff and faculty – were more than adequate for the task at hand.

What is the biggest or most surprising innovation highlighted in this study? We found no evidence for an effect of mining or the current array of high-traffic roads on genetic differentiation; both of these features were hypothesized to be barriers to movement. But the most surprising part was what we did detect: a break in population similarity spatially coinciding with the path of a road which was a major highway for most of the 20th century. Previous research has suggested that highways can cleave populations of herpetofauna, and modeling work has suggested that these effects could persist for many years. We seem to have found evidence for a combination of these hypotheses, and subsampling suggested that we could have come to a similar conclusion with fewer markers and more missing data.

Moving forward, what are the next steps in this area of research? It will be interesting to see what unfolds as more genomic data is integrated into landscape genetics studies, and especially in landscapes with putative barriers of different ages or permeabilities. Re-analysis of existing data sets using (possibly) more sensitive methods, like the spatially-informed methods we used, might reveal barriers where none were detected using other approaches. As for surface coal mining, more study of the consequences of forest fragmentation – ideally, using species which might be more sensitive – could be very informative.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? Try to keep abreast of the new programs coming out. It seems like every month new approaches are being developed, and while the deluge of methods can be overwhelming at times, employing an assortment of different approaches can help enlighten one’s interpretation of genomic patterns.

What have you learned about methods and resources development over the course of this project? I’ve learned about the importance of integrating methods within an ecological framework. While a new method for analyzing genomic data is usually developed to fill a particular analytical gap, translating that goal into an ecological framework can make the method much more accessible to a broader range of researchers. And in general, doing one’s best to stay on top of the new methods coming online is important, if a little overwhelming at times.

Describe the significance of this research for the general scientific community in one sentence. Our results seem to suggest that the genomic legacy of human settlements and infrastructure can persist in wildlife populations beyond the lifespan of the infrastructure itself.

Describe the significance of this research for your scientific community in one sentence. With genomic data and statistical approaches that integrate spatial information, it might be possible to detect relatively weak genetic structuring in wild populations, and it may not require large amounts of the highest-quality data.

Maigret TA, Cox JJ, Weisrock DW. 2020. A spatial genomic approach identifies time lags and historical barriers to gene flow in a rapidly fragmenting Appalachian landscape. Molecular Ecology. https://doi.org/10.1111/mec.15362.

Interview with the authors: utilising GT‐seq for minimally invasive DNA samples

Minimally-invasive sampling is commonly used to obtain samples from rare, elusive or dangerous animals. However, this sampling technique often results in samples that are too low in quality or quantity for successful use with most high-throughput sequencing methods. Using cloacal swabs from the threatened Western Rattlesnake (Crotalus oreganus), Danielle Schmidt and colleagues show that Genotyping-in-Thousands by sequencing (GT-seq) can successfully be used to generate high-throughput sequence data from low-quality, low-quantity samples. We interviewed Danielle Schmidt (first author) and Professor Michael Russello (last author) to find out more about what went on behind-the-scenes of this study.

The Western Rattlesnake (Crotalus oreganus), a threatened species in British Columbia, Canada. Photo credit: Marcus Atkins

What led to your interest in this topic / what was the motivation for this study? 

Conservation genomics has become an increasingly common term in the literature, yet many study systems that involve elusive or at-risk species must rely on minimally- or non-invasive sampling to meet research and management objectives. Although a valuable source of biological material, DNA extracted from minimally- or non-invasive samples is typically of low quantity, poor quality, and contaminated with exogenous DNA, all of which may be incompatible with modern sequencing technologies. Implementing leading-edge genetic and genomic tools to study conservation-related questions has been a long-standing interest in the Russello Lab.

What difficulties did you run into along the way?

Based on earlier work that came out of our lab (Russello et al. 2015 PeerJ), we suspected that employing a non-targeted sequencing approach like RADseq would not be efficient for collecting genotypic data from minimally-invasive samples. Therefore, we decided to test the efficacy of GT-seq (Campbell et al., 2015), as it is a targeted method that could help circumvent the typical issues involved with sequencing and genotyping lower quality DNA. Our biggest challenge was designing a GT-seq SNP panel that minimized ascertainment bias to ensure our downstream estimates of within- and among-population variation would be accurate. Also, given the number of samples and loci we planned to analyze simultaneously, optimizing the workflow for data collection took some time.

Library designs for A) RADseq and B) GT-seq. Included samples selected to facilitate within- and among-method genotype comparisons

What is the biggest or most surprising finding from this study? 

One of the most surprising findings was the exceptionally high genotype consistency between paired blood and cloacal swab samples genotyped with GT-seq, and those blood samples genotyped with both RADseq and GT-seq. We even found that samples with initial concentrations as low as ~0.5 ng/µL successfully amplified, which is promising for future applications of GT-seq with minimally- and non-invasive DNA samples.

Moving forward, what are the next steps for this research? 

We are now exploring the application of GT-seq on a host of species to provide rapid, cost-effective genetic information to support research in molecular ecology and to assist wildlife and fisheries management. We are also testing the performance of this workflow with other non-invasive sample types, including feces and hair. Moving forward, we will be exploring ways of deploying these tools in the field to inform management decisions in real-time.

What would your message be for students about to start their first research projects in this topic?

An important message we would like to convey is to think carefully about potential biases when designing a panel of markers to target, as the composition of your panel must be tailored to your research questions. For example, some applications of GT-seq may seek to intentionally maximize the among-population component of genetic variation in order to identify individuals of unknown origin to a particular fish stock with high confidence. In other cases, as with our study, we wanted a panel that could be used to most accurately reconstruct population structure and connectivity, which we were able to subsequently validate relative to a larger RADseq dataset.

What have you learned about science over the course of this project? 

This project highlighted the benefits of taking a new approach to address a long-standing challenge. In molecular ecology and conservation genetic studies, minimally-invasive sampling is commonly employed as either a required or a preferential approach for obtaining sufficient sample sizes. Yet, it has been recognized since the advent of non-invasive genetic sampling in the 1990s that issues associated with DNA quality and quantity require careful consideration and extra quality control steps. Today, these considerations also apply to the use of modern DNA sequencing technologies with suboptimal starting material; however, GT-seq provides a versatile approach for overcoming DNA quality issues and providing the population-level data needed to address research and management objectives.

Describe the significance of this research for the general scientific community in one sentence.

Multiplexed, amplicon DNA sequencing, such as that employed in GT-seq, is compatible with the minimally-invasive sampling often required for obtaining population-level data to inform biodiversity conservation.

Describe the significance of this research for your scientific community in one sentence.

GT‐seq offers an effective approach for genotyping minimally-invasive samples, providing accurate and precise estimates of within‐ and among‐population diversity metrics relative to genome-wide approaches such as RAD-seq.

Read the full study here:
Schmidt, Danielle A., et al. “Genotyping‐in‐Thousands by sequencing (GT‐seq) panel development and application to minimally invasive DNA samples to support studies in molecular ecology.” Molecular Ecology Resources (2020). https://doi.org/10.1111/1755-0998.13090

Interview with the author: Using host transcriptomics to sample blood parasites

Hosts offer diverse habitats for an incredibly rich array of microbial groups. Genomic resources for many groups residing within hosts (‘infra-communities’) are often poor because of the difficulty of isolating microbial DNA from that of the host, particularly for species living within host cells. In this interview we go behind the scenes with Spencer Galen as he guides us through the transcriptomic approach he developed with colleagues to sample blood parasites such as malaria. Given how ubiquitous and important these parasites can be for animal health, this resource has the potential to pave the way for important advances in disease ecology. Read the paper here.

Avian blood transcriptomes revealed that hosts often have far more complex parasite communities than traditionally thought. For instance, the transcriptome of this Baltimore oriole (Icterus galbula) revealed at least six malaria parasite infections from three malaria parasite genera. The blood smear image from this bird shows the three genera in close contact within the host bloodstream. L: Leucocytozoon, PL: Plasmodium, PA: Parahaemoproteus.
Credit: Spencer Galen

What led to your interest in this topic / what was the motivation for this study? 

This study began with two classic ingredients of scientific discovery: a lot of frustration mixed with a bit of inspiration from other researchers. The frustration was born from a lack of available genetic resources for malaria parasites and other blood parasites, which I felt was hindering the kind of research that I wanted to do. The inspiration came during the first year of my PhD, when several papers were published within a span of just a few months showing that researchers were passively generating large quantities of blood parasite genomic data by sequencing the transcriptomes of their vertebrate hosts. My PhD advisor Susan Perkins and I thought that designing a study to explore this approach in more detail could solve some of my frustrations and help the field of blood parasite research at large.

What difficulties did you run into along the way? 

When we started this project there was always the looming possibility that we would sequence a number of host transcriptomes that were infected with blood parasites and simply not recover any useful parasite data. Even a small-scale transcriptomic project is not a trivial matter financially, and so I will admit that I lost some sleep wondering if this project was a bad idea. Fortunately, field and lab work went quite smoothly, and the results of my first scan for parasites within our initial test transcriptomes exceeded my wildest expectations. And so in reality the biggest challenge was my own self-doubt – if I had paid too much attention to those thoughts, this project might not have gotten off the ground.

What is the biggest or most surprising innovation highlighted in this study? 

We were astounded by just how prevalent blood parasite transcripts can be within host transcriptomes. For instance, in one bird (Vireo plumbeus sampled in the mountains of New Mexico) we found that nearly 17% of all contigs generated from the initial Trinity assembly were derived from a parasite that was infecting just 0.75% of all blood cells. A second surprising finding was the degree to which many of the birds that we sampled were infected with complex communities of parasites that we did not detect using traditional microscopic and DNA barcoding methods. Across all samples we found that transcriptomes revealed about 20% more infections than the methods that are typically used to study these parasites. This included one individual bird that was infected by three different genera and at least six species of malaria parasite.

Moving forward, what are the next steps in this area of research?

While it is exciting to find that a transcriptomic approach can improve our ability to study the genomic diversity and abundance of wildlife blood parasites, it still remains a rather inefficient approach – at the end of the day, the majority of transcripts from each sample came from the host organism that was not the focus of our study. The next step will be to apply single-cell and other advanced RNA sequencing techniques that have successfully been applied to model systems to provide greater resolution to studies of blood parasite gene expression and host-parasite interactions.   

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

At risk of sounding overly pessimistic, be prepared for things to fail the first time around and have a plan B in place. It is wonderful to have a lot of confidence, but pessimism does tend to favor preparedness. Small actions within this frame of mind can save you a lot of grief in the long run, and can be as simple as testing a new method on a sample that isn’t important before you start your project or taking the time to visit a lab to learn a technique before you try it yourself. I naturally assume everything I try in the lab will fail, so each time things work (and they actually often do!) it is a pleasant surprise.

What have you learned about methods and resources development over the course of this project? 

I think that there is a difference between producing a resource, and producing a resource that is easily accessible to the broader research community in practice. As a result, I spent a lot of time thinking about how my colleagues would most directly benefit from the data that we had generated. In the end we made the data from this study available in as many formats as we thought might be useful to other researchers (raw sequences, assemblies from before and after parasite identification, curated alignments, DNA barcodes, etc.). The amount of time that it took to prepare these datasets was extremely small relative to the length of the entire project, and I think will go a long way towards making these data as useful as possible.

Describe the significance of this research for the general scientific community in one sentence.

This study improves our ability to research the ecology and evolution of wildlife blood parasites, a cosmopolitan and ubiquitous group that is widely relevant to global health.

Describe the significance of this research for your scientific community in one sentence.

The methodological framework that we present in this study profoundly improves the genomic resource base that is available to research understudied blood pathogens of wildlife, as well as our ability to detect multi-species parasite communities within hosts.

Interview with the authors: response to amphibian-killing fungus is altered by temperature

Recently, Drs. Ellison, Zamudio, Lips, and Muletz-Wolz published their work focused on some of the ways amphibians respond to an infection by Batrachochytrium dendrobatidis (Bd). Bd is a fungus that is causing a devastating worldwide decline of amphibians, meaning that understanding how some species manage the infection is important for conservation of myriad species. Using an elegant experimental setup and subsequent RNA sequencing data, Dr. Ellison and co-authors suggest that the variation in amphibian susceptibility to the fungus, which is related to temperature, occurs due to concurrent temperature-dependent shifts in immune system function; lower temperatures were associated with an inflammatory response, while higher temperatures were associated with an adaptive immune response. Understanding exactly how and when this fungus alters wild amphibian populations is important for conservation of these often imperiled species. For more information, please see the full article and the interview with Dr. Ellison below.

Eastern red-backed salamander (Plethodon cinereus). Photo credit: Alberto Lopez.

What led to your interest in this topic / what was the motivation for this study? I have always been fascinated with how parasites and pathogens influence fitness and shape host populations, particularly generalists infecting a wide range of host species. The pathogenic chytrid, Batrachochytrium dendrobatidis (Bd), is arguably one of the most generalist pathogens known to science, capable of infecting hundreds of amphibian species globally. However, even within a single host species, disease outcome (e.g. succumbing to or clearing infection) is highly variable and is often temperature-dependent. Given the devastating impacts Bd has already had on amphibian populations, the recent discovery of another amphibian-killing chytrid (B. salamandrivorans), and the ever-pressing threat of climate change, we were driven to uncover how amphibian gene expression responses to chytrid infections vary under different temperatures.

What difficulties did you run into along the way? For me, it was the sheer scale of the sequencing dataset. Plethodon salamanders, notorious for their large genome sizes, had yet to have a published genome or transcriptome to use as a reference for RNAseq studies such as ours. Therefore, we had to ensure sufficient sequencing to de novo assemble the transcriptome, and enough per-sample depth to capture potentially subtle but important changes in gene expression due to temperature and infection. With multiple temperature treatments and multiple disease outcomes at each temperature, this resulted in a relatively large RNAseq dataset of over 2 billion reads. Thankfully, having returned to Wales from the US by the time we received our sequence data, I had access to Supercomputing Wales, a nationwide high-powered computing initiative that allowed me to handle the computationally intensive analyses. More importantly, without the hard work of the other authors to carefully design and execute the highly-controlled animal experiments to generate the tissue samples, this study would simply not have been possible.

What is the biggest or most surprising innovation highlighted in this study? I think that, within a relatively narrow thermal range, the substantial shifts in the types of immune genes being expressed in response to infection are really important to our understanding of chytrid infection dynamics. The finding that adaptive immune transcripts (particularly those involved in MHC pathways) are more highly expressed at warmer temperatures – where amphibians tend to survive infection better – is most exciting. Given the growing evidence for the importance of certain MHC allele variants in Bd resistance, our results suggest it is not only what MHC genotype amphibians possess, but how they express those genes during infection, that dictates survival.

Moving forward, what are the next steps in this area of research? This study, while providing new insights into how temperature influences Bd-amphibian interactions, has generated many further questions. Some of the authors on this study have recently shown that both temperature and Bd have a significant impact on amphibian skin microbiome communities, a potentially critical line of defense against infections. It is currently unknown whether temperature-dependent host immune expression responses to Bd shape skin microbiomes during infection or if skin bacteria are influencing host responses (or a combination of both). Work to directly assess host gene expression under different microbial community compositions would be an exciting future avenue of research. In addition, further investigation of both MHC genotype and expression phenotype simultaneously could be highly relevant to understanding intraspecific variation in chytrid resistance. Finally, we have previously developed methods to quantify Bd gene expression in vivo; it would be fascinating to couple our current findings with how Bd genes are expressed in-host under different temperatures.

Dr Carly Muletz-Wolz field sampling. Photo credit: Karen Lips.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? Many others on this blog have already highlighted the importance of well-thought-out experimental designs and the need to get to grips with the theory before embarking on a project, and I can only echo that. Despite having now worked on many transcriptomic datasets in non-model organisms, I still sometimes get overwhelmed by the amount of information that could potentially be conveyed in a manuscript, particularly with more complex experimental designs such as the one in this study. I recommend periodically taking a step back from your analyses, sharing them with colleagues to gauge the most important “headline” results, and finally, don’t worry that some things have to go in as supplemental material; they can still be gems of information that kick off an exciting new line of inquiry for someone!

What have you learned about methods and resources development over the course of this project? With high-throughput sequencing methods becoming ever more accessible and the explosion of innovative ways to analyse and present NGS data, it is all too easy to feel your project is not “cutting-edge” enough. It’s all very well having billions of sequences and a slick set of figures, but a research team most importantly needs to be able to provide meaningful biological/ecological interpretation. That’s why it has been great to be part of a collaborative team of amphibian ecologists and geneticists, which was critical to the development of this new resource of information on salamander transcriptomic responses to temperature and infection.

Describe the significance of this research for the general scientific community in one sentence. The thermally-altered transcriptional responses of salamanders to fungal pathogen infection are an important component to understanding observed seasonal and climatic patterns of chytrid disease outbreaks.

Describe the significance of this research for your scientific community in one sentence. Our results suggest shifts from inflammatory to adaptive immune gene expression responses to Bd infection at warmer temperatures are a key component to thermal and/or seasonal patterns of amphibian chytridiomycosis.

Eastern red-backed salamander (Plethodon cinereus). Photo credit: Dr Carly Muletz-Wolz.

Ellison A, Zamudio K, Lips K, Muletz-Wolz C. 2019. Temperature-mediated shifts in salamander transcriptomic responses to the amphibian-killing fungus. Molecular Ecology 28:5086-5102.

Interview with the author: Creating the SPIKEPIPE metagenomic pipeline

Obtaining reliable abundance estimates is a significant challenge for eDNA metagenomic studies. One important issue is that sequencing introduces multiple sources of noise that can significantly alter the accuracy of abundance estimates. Here we interview Douglas Yu, a professor at the University of East Anglia, about the SPIKEPIPE pipeline recently published in Molecular Ecology Resources. This method is particularly exciting as it can use either short read barcodes or mitogenome data to estimate species abundances by accounting for sequencing noise using correction factors. The authors test this eDNA pipeline on arthropod samples taken from the High Arctic in Greenland and show that this approach can produce remarkably accurate species abundance estimates compared to samples of known composition. Read the full article here and get the code to run this pipeline here.

The 5 steps of SPIKEPIPE.

What led to your interest in this topic / what was the motivation for this study? 

We very much want to know how a heating climate is affecting biodiversity. Greenland is a direct window into this, both because heating has progressed very fast here, and because local species richness is manageable for study:  375 known aboveground arthropod species at the Zackenberg research station. Equally important, the Danish research station at Zackenberg had had the foresight to systematically collect arthropods starting in 1996, and those samples were sitting in ethanol in a warehouse in Denmark. The main obstacle to using them had been that no one could identify the hundreds of thousands of individuals to species level. Luckily, Helena Wirta and Tomas Roslin had in parallel carried out a DNA barcoding campaign at Zackenberg. Put together, we had in our hands a complete time series of community dynamics over a stretch of time during which summer had almost doubled in length. 

What difficulties did you run into along the way? 

When we started, we were all set to use metabarcoding. However, we soon learned (not surprisingly) that the sample-handling protocols had not been designed with molecular methods in mind:  the trap water was reused across time periods, the collecting net was used across traps, and the sorting trays were not bleached between samples. We thus needed a protocol that would be robust to cross-sample contamination and would ideally return quantitative information, since we wanted to detect change in population dynamics. This is why we turned to mitochondrial metagenomics (Tang et al. 2015, Crampton-Platt et al. 2016) and came up with SPIKEPIPE, which combines read-mapping, a percent-coverage detection threshold, and a spike-in to correct for pipeline stochasticity. 
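
For readers who think in code, the combination described above (read-mapping, a percent-coverage detection threshold, and a spike-in correction) can be sketched roughly as follows. This is only an illustrative sketch and not the published SPIKEPIPE code: the class and function names, the 0.5 coverage cutoff, and the simple "mapped reads divided by spike-in reads" correction are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class SpeciesMapping:
    """Read-mapping summary for one species in one sample (hypothetical structure)."""
    species: str
    mapped_reads: int    # reads mapped to this species' reference sequence
    pct_coverage: float  # fraction (0-1) of reference positions covered by reads


def spike_corrected_abundance(mappings, spike_reads, min_pct_coverage=0.5):
    """Return a spike-normalised abundance index per detected species.

    A species counts as detected only if reads cover at least
    `min_pct_coverage` of its reference (an assumed cutoff), which guards
    against a few stray contaminant reads. Dividing by the reads mapped to
    the spike-in puts samples with different sequencing output on one scale.
    """
    if spike_reads <= 0:
        raise ValueError("spike_reads must be positive to normalise abundances")
    abundances = {}
    for m in mappings:
        if m.pct_coverage < min_pct_coverage:
            continue  # treated as a likely false positive
        abundances[m.species] = m.mapped_reads / spike_reads
    return abundances


# Made-up numbers for one sample: species B has reads but poor coverage.
sample = [
    SpeciesMapping("Species A", mapped_reads=200, pct_coverage=0.9),
    SpeciesMapping("Species B", mapped_reads=15, pct_coverage=0.1),
]
result = spike_corrected_abundance(sample, spike_reads=100)
print(result)
```

In this toy sample, species B is dropped because its reads cover too little of its reference, which is the sort of low-level cross-sample contamination the threshold is meant to absorb, while species A is reported relative to the spike-in.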

What is the biggest or most surprising innovation highlighted in this study? 

The individual elements of SPIKEPIPE were reasonably well known, but what we hadn’t anticipated is just how accurate the results were when combined in a single pipeline. With mock samples, we found no false-positive species detections (when the percent-coverage threshold is applied) and recovered highly accurate estimates of intraspecific abundances (in terms of DNA mass). With resequenced environmental samples, we found high repeatability of abundance estimates across sample repeats, even though DNA extraction and Illumina library prep, sequencing, and base-calling all inject stochasticity into datafile sizes.

Also very gratifying was finding that SPIKEPIPE returned useful data even when mapping reads only to short DNA barcodes, as originally presaged by Xin et al. (2013). This means that we can make use of the existing vast DNA-barcode reference library.

Moving forward, what are the next steps in this area of research?

SPIKEPIPE is of course only the means to an end, and our next goal is the statistical analysis of community change in a rapidly heating ecosystem. Nerea Abrego and Otso Ovaskainen are now applying joint species distribution modelling (with the R package Hmsc, Tikhonov et al. 2019) to the dataset of 712 pitfall-trap samples. One important question is to quantify how much of the year-to-year variation in species abundances can be attributed to species interactions, as opposed to climate variables. 

More broadly, the result that SPIKEPIPE can be used with DNA barcodes makes possible an intriguing strategy:   one may now generate both the species reference database and the sample-by-species table from the same set of samples. We are using Greenfield et al.’s (2019) Kelpie software to carry out targeted assembly of DNA barcodes from shotgun-sequenced bulk samples, which we compile into a single DNA-barcode reference database, against which we then map reads from each sample to generate the data table. 

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

Build in a lot of testing:  multiple, complex mock samples for pipeline development, repeat environmental samples to measure repeatability, realistically complex positive controls, many negative controls, and many sanity checks as you work through your bioinformatic code. 

You are likely to be learning to code at the same time that you write your first pipelines. Take the extra time *now* to learn and apply robust coding techniques, even if there are easier but less robust methods available. 

Read Jenny Bryan’s tutorial on file naming:  https://speakerdeck.com/jennybc/how-to-name-files

What have you learned about methods and resources development over the course of this project? 

A great way to inspire new methods is to talk with non-molecular researchers about their scientific questions, currently used methods, and available sample types. Our team includes arctic ecologists, molecular ecologists, and a mathematician.

For one’s method to have impact, it will need to be useful for years after one first thinks of it. Stay up to date with technology trends, including costs, to avoid rapid obsolescence.

Describe the significance of this research for the general scientific community in one sentence.

We can use DNA sequencing to quantify how insect and spider communities respond to environmental change.

Describe the significance of this research for your scientific community in one sentence.

Mitochondrial metagenomics is a viable alternative to amplicon sequencing for characterising arthropod communities.