The story behind the Special Feature: Genomics of natural history collections

We are really excited to get a sneak peek into the story behind a new Special Feature in Molecular Ecology Resources focusing on the use of genomic techniques to better understand natural history collections. In this Special Feature, the authors, led by Assistant Professor Lua Lopez, compiled a broad range of studies using a variety of methods to illustrate the enormous potential of museum samples to answer questions fundamental to molecular ecology. See below for a video interview with Lua and the article. Check out the great set of articles in the special feature here.

  1. What led you to put together a special issue on this topic?
Lua Lopez. Assistant Professor at California State University

My first contact with ancient genomics was during my postdoc at PSU at the Lasky Lab. As soon as I started looking for literature to help me get the project started I realized that, except in the field of human ancient genomics, information was scattered and it was not easy to find methodological papers on the wet-lab work or bioinformatics for this type of data. We were lacking a strong foundation of studies using a combination of ancient, historical and modern samples stored in museums. Because of all this I wanted to put together an issue compiling a critical mass of studies using Natural History Collections (NHC) to advance the field of evolutionary biology. Although I had been thinking about this for a while, I only ventured to put this together when two new postdocs also working with NHC samples joined the lab, Dr. Kathryn Turner and Dr. Emily Bellis. The three of us, together with our postdoc supervisor Jesse Lasky, decided it was time to get this running and I am very excited about the result.

2. Of the papers in the special feature, can you identify any broad trends?

All papers provide a significant advance in important methodological steps (from DNA extraction to data analysis), facilitating the use of NHC samples in evolutionary studies. The data used to test the methods in these papers provide a glimpse of the new research avenues that NHC samples can open.

3. What did you find the most surprising about the papers in this feature?

It was incredible to see how many fields can benefit from using NHC samples. This issue not only covers methodological aspects but also shows how NHC samples can help answer long-standing questions in the fields of metagenomics, epigenetics, conservation genomics, evolutionary ecology and phylogenetics.

4. What do you recommend to researchers trying to collect genomic data from natural history collections?

Contact as many NHCs as you can. There are still many collections that are not digitized, and being aware of what is available can have a large impact on your experimental design. If this is your first time working with NHC samples, team up: genetic studies with NHC samples can be a big challenge (high risk, high reward). Having someone with experience to guide you is going to be one of the best things you can do to ensure the success of your research.

5. What do you think are crucial next research steps to effectively utilizing natural history collections?

I strongly believe that the next steps include digitizing NHC collections and archiving DNA data. Many NHC samples are not yet digitized, and researchers looking for a particular species can only obtain a partial picture of what’s available for their studies. The accuracy with which we can answer particular questions is, in most cases, determined by the samples we have access to (i.e. number of samples, geographical and temporal distribution). In addition, any genetic data obtained from NHC samples should be publicly available. By having access to larger data sets we can not only increase the accuracy of our results but also better predict future scenarios.

6. What (if any) method advances are needed?

In the past 10-20 years, we have improved enormously in our wet lab protocols and bioinformatics, but the intrinsic nature of DNA from NHC samples means that we still have a long way to go. Ideally, we want to standardize protocols for large taxonomic groups and identify which factors have the largest impact on DNA damage. This also applies to pipelines for data analysis. In general, the more standardized the protocols are, the better: it will help us compare results among studies and identify broad evolutionary patterns.

7. What would your message be for students about to start their first research projects on this topic?

Understand what kind of samples you have in your hands. It’s not only about how old they are, it’s also about how they were preserved after sampling and during storage, where they are coming from, how much material you have, etc. Many factors are going to influence the success of obtaining DNA of sufficient quality for downstream analysis. And the same goes for the data analysis: make sure you are considering the particular nature of the genetic data that you are analyzing. NHC samples are precious and destructive sampling cannot be done lightly. So, always do a test run and ask all the questions that you have.

Interview with the chief editor: Molecular Ecology Resources (MER)

In this special new-years post we interview the Chief Editor of MER Shawn Narum. Shawn, based at the Columbia River Inter-Tribal Fish Commission and the University of Idaho, has been chief editor for over 5 years. In this interview we get his perspective on the journal and the field in general as well as his advice for early career researchers.

See this link for a past interview with Shawn all the way back in 2014 with the Molecular Ecologist and this link for his 2020 editorial.


What are some of the main changes you have witnessed in the field of molecular ecology since you became Chief Editor of MER?

Advances in molecular and statistical methods have driven the field of molecular ecology to new heights. Questions that were previously out of reach can now be addressed for most non-model species with careful study design.

What methods and resources do you think the field needs in the future?

Advances in sequencing methods have led to fascinating discoveries of candidate genes associated with local adaptation and phenotypic variation in many species, but development of candidate markers for intensive testing and validation is lacking. For example, bioinformatic resources are needed that efficiently and accurately develop primers/baits for specific subsets of markers that can be genotyped cost effectively in many individuals (e.g., Meek & Larson, 2019).

What are some of your favourite scientific discoveries from the past two decades?

Genomic islands of divergence are real! These islands often occur as inversions with low recombination that drive life history variation in organisms ranging from plants (Hoffmann & Rieseberg, 2008) to birds (Lamichhaney et al., 2016) and fish (Jones et al., 2012).

As a fish geek, I also very much enjoyed the discovery that there is a warm blooded fish! It has long been known that some species like tuna and swordfish exhibit partial endothermy in brain tissue, but discovery of whole body endothermy in Opah living in cold, deep seas makes me smile (Wegner et al., 2015).

What advice would you give students wanting to develop a career in science?

Establish close collaborations with colleagues that you trust and nurture those relationships for the long-term.

What advice would you give to your younger-self about science and life?

Seize opportunities to work with others in a team environment, but it is OK to turn down some opportunities when there is already too much on your plate. “Too much” is when you can’t keep up with expectations that you have for yourself, or when projects substantially interfere with spending time with the people you love.

What is your writing style like? Do you have some favourite writers that inspired you earlier on during your career?

My writing tends to be structured following a mental or written outline for clearly defined study questions. I have always been inspired by papers coming from Louis Bernatchez and have been grateful to have co-authored a few recent articles with him.

What are some of the aspects of your job as a scientist that you enjoy the most?

Two of the most rewarding aspects of my work are being involved with the development of young scientists and making new genomic discoveries that contribute towards conservation and recovery of naturally occurring species.

Outside of sequencing, what is your favourite methodological advance in the last five years?

Statistical advances that improve signal to noise in order to reduce false positives are critical to our field. One such approach, called “Local score”, was developed by Fariello et al. (2017) to account for linked SNPs from high density genome scans to yield strong candidates (after Bonferroni correction). This is a powerful approach to detect adaptive genetic variation.

References

Meek, M. H., & Larson, W. A. (2019). The future is now: amplicon sequencing and sequence capture usher in the conservation genomics era. Molecular Ecology Resources, 19, 795–803.
https://doi.org/10.1111/1755-0998.12998

Hoffmann, A. A., & Rieseberg, L. H. (2008). Revisiting the impact of inversions in evolution: from population genetic markers to drivers of adaptive shifts and speciation? Annual Review of Ecology, Evolution, and Systematics, 39, 21–42.
https://doi.org/10.1146/annurev.ecolsys.39.110707.173532

Lamichhaney, S., Fan, G., Widemo, F., Gunnarsson, U., Thalmann, D. S., Hoeppner, M. P., … & Chen, W. (2016). Structural genomic changes underlie alternative reproductive strategies in the ruff (Philomachus pugnax). Nature Genetics, 48(1), 84.
https://doi.org/10.1038/ng.3430

Jones, F. C., Grabherr, M. G., Chan, Y. F., Russell, P., Mauceli, E., Johnson, J., … & Birney, E. (2012). The genomic basis of adaptive evolution in threespine sticklebacks. Nature, 484(7392), 55.
https://doi.org/10.1038/nature10944

Wegner, N. C., Snodgrass, O. E., Dewar, H., & Hyde, J. R. (2015). Whole-body endothermy in a mesopelagic fish, the opah, Lampris guttatus. Science, 348(6236), 786-789.
https://doi.org/10.1126/science.aaa8902

Fariello, M. I., Boitard, S., Mercier, S., Robelin, D., Faraut, T., Arnould, C., … & Gourichon, D. (2017). Accounting for linkage disequilibrium in genome scans for selection without individual genotypes: the local score approach. Molecular Ecology, 26(14), 3700–3714.
https://doi.org/10.1111/mec.14141

Interview with the author: Using host transcriptomics to sample blood parasites

Hosts offer diverse habitats for an incredibly rich array of microbial groups. Genomic resources for many groups residing within hosts (‘infra-communities’) are poor, often because it is difficult to separate the microbe’s DNA from that of the host, particularly for species living within host cells. In this interview we go behind the scenes with Spencer Galen as he guides us through the transcriptomic approach he developed with colleagues to sample blood parasites such as malaria. Given how ubiquitous and important these parasites can be for animal health, this resource has the potential to pave the way for important advances in disease ecology. Read the paper here.

Avian blood transcriptomes revealed that hosts often have far more complex parasite communities than traditionally thought. For instance, the transcriptome of this Baltimore oriole (Icterus galbula) revealed at least six malaria parasite infections from three malaria parasite genera. The blood smear image from this bird shows the three genera in close contact within the host bloodstream. L: Leucocytozoon, PL: Plasmodium, PA: Parahaemoproteus.
Credit: Spencer Galen

What led to your interest in this topic / what was the motivation for this study? 

This study began with two classic ingredients of scientific discovery: a lot of frustration mixed with a bit of inspiration from other researchers. The frustration was born from a lack of available genetic resources for malaria parasites and other blood parasites, which I felt was hindering the kind of research that I wanted to do. The inspiration came during the first year of my PhD, when several papers were published within a span of just a few months showing that researchers were passively generating large quantities of blood parasite genomic data by sequencing the transcriptomes of their vertebrate hosts. My PhD advisor Susan Perkins and I thought that designing a study to explore this approach in more detail could solve some of my frustrations and help the field of blood parasite research at large.

What difficulties did you run into along the way? 

When we started this project there was always the looming possibility that we would sequence a number of host transcriptomes that were infected with blood parasites and simply not recover any useful parasite data. Even a small-scale transcriptomic project is not a trivial matter financially, and so I will admit that I lost some sleep wondering if this project was a bad idea. Fortunately, field and lab work went quite smoothly, and the results of my first scan for parasites within our initial test transcriptomes exceeded my wildest expectations. And so in reality the biggest challenge was my own self-doubt – if I had paid too much attention to those thoughts, this project might not have gotten off the ground.

What is the biggest or most surprising innovation highlighted in this study? 

We were astounded by just how prevalent blood parasite transcripts can be within host transcriptomes. For instance, in one bird (Vireo plumbeus sampled in the mountains of New Mexico) we found that nearly 17% of all contigs generated from the initial Trinity assembly were derived from a parasite that was infecting just 0.75% of all blood cells. A second surprising finding was the degree to which many of the birds that we sampled were infected with complex communities of parasites that we did not detect using traditional microscopic and DNA barcoding methods. Across all samples we found that transcriptomes revealed about 20% more infections than the methods that are typically used to study these parasites. This included one individual bird that was infected by three different genera and at least six species of malaria parasite.

Moving forward, what are the next steps in this area of research?

While it is exciting to find that a transcriptomic approach can improve our ability to study the genomic diversity and abundance of wildlife blood parasites, it still remains a rather inefficient approach – at the end of the day, the majority of transcripts from each sample came from the host organism that was not the focus of our study. The next step will be to apply single-cell and other advanced RNA sequencing techniques that have successfully been applied to model systems to provide greater resolution to studies of blood parasite gene expression and host-parasite interactions.   

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

At risk of sounding overly pessimistic, be prepared for things to fail the first time around and have a plan B in place. It is wonderful to have a lot of confidence, but pessimism does tend to favor preparedness. Small actions within this frame of mind can save you a lot of grief in the long run, and can be as simple as testing a new method on a sample that isn’t important before you start your project or taking the time to visit a lab to learn a technique before you try it yourself. I naturally assume everything I try in the lab will fail, so each time things work (and they actually often do!) it is a pleasant surprise.

What have you learned about methods and resources development over the course of this project? 

I think that there is a difference between producing a resource, and producing a resource that is easily accessible to the broader research community in practice. As a result, I spent a lot of time thinking about how my colleagues would most directly benefit from the data that we had generated. In the end we made the data from this study available in as many formats as we thought might be useful to other researchers (raw sequences, assemblies from before and after parasite identification, curated alignments, DNA barcodes, etc.). The amount of time that it took to prepare these datasets was extremely small relative to the length of the entire project, and I think will go a long way towards making these data as useful as possible.

Describe the significance of this research for the general scientific community in one sentence.

This study improves our ability to research the ecology and evolution of wildlife blood parasites, a cosmopolitan and ubiquitous group that is widely relevant to global health.

Describe the significance of this research for your scientific community in one sentence.

The methodological framework that we present in this study profoundly improves the genomic resource base that is available to research understudied blood pathogens of wildlife, as well as our ability to detect multi-species parasite communities within hosts.

Interview with the author: Creating the SPIKEPIPE metagenomic pipeline

Obtaining reliable abundance estimates is a significant challenge for eDNA metagenomic studies. One important issue is that sequencing introduces multiple sources of noise that can significantly reduce the accuracy of abundance estimates. Here we interview Douglas Yu, a professor at the University of East Anglia, about the SPIKEPIPE pipeline recently published in Molecular Ecology Resources. This method is particularly exciting as it can use either short read barcodes or mitogenome data to estimate species abundances by accounting for sequencing noise using correction factors. The authors test this eDNA pipeline on arthropod samples taken from the High Arctic in Greenland and show that this approach can produce remarkably accurate species abundance estimates when checked against samples of known composition. Read the full article here and get the code to run this pipeline here.

The 5 steps of SPIKEPIPE.

What led to your interest in this topic / what was the motivation for this study? 

We very much want to know how a heating climate is affecting biodiversity. Greenland is a direct window into this, both because heating has progressed very fast here, and because local species richness is manageable for study:  375 known aboveground arthropod species at the Zackenberg research station. Equally important, the Danish research station at Zackenberg had had the foresight to systematically collect arthropods starting in 1996, and those samples were sitting in ethanol in a warehouse in Denmark. The main obstacle to using them had been that no one could identify the hundreds of thousands of individuals to species level. Luckily, Helena Wirta and Tomas Roslin had in parallel carried out a DNA barcoding campaign at Zackenberg. Put together, we had in our hands a complete time series of community dynamics over a stretch of time during which summer had almost doubled in length. 

What difficulties did you run into along the way? 

When we started, we were all set to use metabarcoding. However, we soon learned (not surprisingly) that the sample-handling protocols had not been designed with molecular methods in mind:  the trap water was reused across time periods, the collecting net was used across traps, and the sorting trays were not bleached between samples. We thus needed a protocol that would be robust to cross-sample contamination and would ideally return quantitative information, since we wanted to detect change in population dynamics. This is why we turned to mitochondrial metagenomics (Tang et al. 2015, Crampton-Platt et al. 2016) and came up with SPIKEPIPE, which combines read-mapping, a percent-coverage detection threshold, and a spike-in to correct for pipeline stochasticity. 
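The three ingredients named above (read-mapping counts, a percent-coverage detection threshold, and a spike-in correction) can be illustrated with a minimal Python sketch. This is not the SPIKEPIPE code itself: the class and function names, the 0.5 coverage cut-off, and the simple reads-per-spike-read ratio are all assumptions made for illustration, and the published pipeline may compute its correction differently.

```python
from dataclasses import dataclass


@dataclass
class MappingResult:
    """Hypothetical per-species summary of read mapping for one sample."""
    species: str
    mapped_reads: int        # reads mapped to this species' reference sequence
    percent_coverage: float  # fraction (0-1) of reference positions covered by reads


def spike_corrected_abundance(results, spike_reads, min_coverage=0.5):
    """Apply a coverage threshold, then scale read counts by the spike-in.

    min_coverage is an illustrative value, not the threshold from the paper.
    Dividing by the number of reads mapped to the spike-in standard puts
    samples with different sequencing depths on a comparable scale.
    """
    if spike_reads <= 0:
        raise ValueError("spike_reads must be positive to normalise a sample")
    abundances = {}
    for result in results:
        # Species whose reference is only patchily covered are treated as
        # absent, which guards against low-level cross-sample contamination.
        if result.percent_coverage < min_coverage:
            continue
        abundances[result.species] = result.mapped_reads / spike_reads
    return abundances


sample = [
    MappingResult("species_a", mapped_reads=200, percent_coverage=0.9),
    MappingResult("species_b", mapped_reads=50, percent_coverage=0.1),
]
print(spike_corrected_abundance(sample, spike_reads=100))
```

The design point is that the coverage filter decides presence or absence, while the spike-in ratio only rescales the species that pass it.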

What is the biggest or most surprising innovation highlighted in this study? 

The individual elements of SPIKEPIPE were reasonably well known, but what we hadn’t anticipated is just how accurate the results were when combined in a single pipeline. With mock samples, we found no false-positive species detections (when the percent-coverage threshold is applied) and recovered highly accurate estimates of intraspecific abundances (in terms of DNA mass). With resequenced environmental samples, we found high repeatability of abundance estimates across sample repeats, even though DNA extraction and Illumina library prep, sequencing, and base-calling all inject stochasticity into datafile sizes.

Also very gratifying was finding that SPIKEPIPE returned useful data even when mapping reads only to short DNA barcodes, as originally presaged by Xin et al. (2013). This means that we can make use of the existing vast DNA-barcode reference library.

Moving forward, what are the next steps in this area of research?

SPIKEPIPE is of course only the means to an end, and our next goal is the statistical analysis of community change in a rapidly heating ecosystem. Nerea Abrego and Otso Ovaskainen are now applying joint species distribution modelling (with the R package Hmsc, Tikhonov et al. 2019) to the dataset of 712 pitfall-trap samples. One important question is to quantify how much of the year-to-year variation in species abundances can be attributed to species interactions, as opposed to climate variables. 

More broadly, the result that SPIKEPIPE can be used with DNA barcodes makes possible an intriguing strategy:   one may now generate both the species reference database and the sample-by-species table from the same set of samples. We are using Greenfield et al.’s (2019) Kelpie software to carry out targeted assembly of DNA barcodes from shotgun-sequenced bulk samples, which we compile into a single DNA-barcode reference database, against which we then map reads from each sample to generate the data table. 

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 

Build in a lot of testing:  multiple, complex mock samples for pipeline development, repeat environmental samples to measure repeatability, realistically complex positive controls, many negative controls, and many sanity checks as you work through your bioinformatic code. 

You are likely to be learning to code at the same time that you write your first pipelines. Take the extra time *now* to learn and apply robust coding techniques, even if there are easier but less robust methods available. 

Read Jenny Bryan’s tutorial on file naming:  https://speakerdeck.com/jennybc/how-to-name-files

What have you learned about methods and resources development over the course of this project? 

A great way to inspire new methods is to talk with non-molecular researchers about their scientific questions, currently used methods, and available sample types. Our team includes arctic ecologists, molecular ecologists, and a mathematician.

For one’s method to have impact, it will need to be useful for years after one first thinks of it. Stay up to date with technology trends, including costs, to avoid rapid obsolescence.

Describe the significance of this research for the general scientific community in one sentence.

We can use DNA sequencing to quantify how insect and spider communities respond to environmental change.

Describe the significance of this research for your scientific community in one sentence.

Mitochondrial metagenomics is a viable alternative to amplicon sequencing for characterising arthropod communities. 

CRISPR-Cas Diagnostics for Environmental Monitoring

In a special blog post, Molly-Ann Williams (@WilliamsMolly_9) and Anne Parle-McDermott (@anne_parle) from the School of Biotechnology and DCU Water Institute, Dublin City University provide an overview of how CRISPR-Cas works and how it can be applied to ecology and monitoring in particular. Read their recently published Molecular Ecology Resources paper here.

The field of CRISPR-Cas for genome editing has simply exploded since its introduction in 2012. The discovery of many different Cas enzymes with additional natural or genetically engineered functionalities is resulting in an increase in CRISPR-Cas applications across all fields from food security to medicine.

Number of Scopus search results for query “CRISPR” in given year. Search performed on 21 November 2019.

So how can we join the revolution and apply CRISPR-Cas to the field of Ecology?

CRISPR-Cas systems consist of two main elements: a guide and a nuclease. Guides (made of RNA) direct the nuclease (Cas enzyme) to specific nucleic acid sequences (DNA or RNA). Upon target recognition the nuclease carries out the desired response, most commonly cleavage of the target sequence. The initially discovered CRISPR-Cas system relied on a nuclease called Cas9. This enzyme is involved in highly specific cleavage of target sequences that allow genome editing to occur by activating the natural repair system of the cell. More recently the applications of this system have been expanded beyond genome editing by the discovery of several new Cas enzymes with a secondary function i.e., the indiscriminate cleavage of single stranded nucleic acids upon target recognition. The discovery of these Cas enzymes has revolutionised nucleic acid diagnostics due to two main features:

  1. Protein-guide and cleavage molecules (Cas): able to specifically recognise target nucleic acids, cleave the target sequence and subsequently cleave other non-specific nucleic acids.
  2. Nucleic acids as reporters: the non-specific nucleic acids can be designed as a reporter molecule that releases measurable signal when cleaved. This allows us to visualise when the initial target sequence has been detected and apply it to diagnostics and species monitoring.

Two main elements of a CRISPR-Cas diagnostic system: Cas enzyme and guide RNA effector complex and single stranded (ss) nucleic acid reporter molecule. In this example, the nuclease is Cas12a specific to DNA detection downstream from a TTTV PAM site. Adapted from Williams MA et al (2019).

The three main Cas enzymes of interest for diagnostics are Cas12, Cas13 and Cas14 each with unique functions applicable to different types of tests (for a more detailed discussion of these enzymes visit this blog).

The Cas enzyme most relevant for single species detection from environmental DNA is the enzyme Cas12a. This nuclease can detect both ssDNA and dsDNA but can only recognise DNA sequences downstream from a TTTV protospacer adjacent motif (PAM). Importantly, Cas12a cannot detect DNA sequences missing this PAM site. This is vital when designing single species detection assays.

Do you have two closely related species that you want to distinguish? Searching your target species sequence for a site downstream of a PAM site found ONLY in your target, and not in sympatric species, will ensure highly specific recognition and prevent detection of non-target species.
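As a rough illustration of that search, the Python sketch below scans a target sequence for TTTV PAM sites (V is the IUPAC code for A, C or G), takes the bases immediately downstream as a candidate guide target, and keeps only the sites that are absent from the non-target sequences. The function names, the 20-base guide length, and the exact-match comparison are assumptions made for illustration. A real design would also scan the reverse strand and check near-matches with a few mismatches, which this sketch does not do.

```python
import re

# IUPAC code V means A, C or G, so TTTV expands to TTT[ACG].
# The lookahead lets overlapping PAM sites be reported as well.
PAM_PATTERN = re.compile(r"(?=(TTT[ACG]))")
GUIDE_LENGTH = 20  # assumed target length; adjust to your own assay design


def candidate_targets(sequence, guide_length=GUIDE_LENGTH):
    """Return (pam_start, pam, downstream_target) for each TTTV site on this strand."""
    sequence = sequence.upper()
    hits = []
    for match in PAM_PATTERN.finditer(sequence):
        pam_start = match.start()
        downstream = sequence[pam_start + 4 : pam_start + 4 + guide_length]
        # Skip PAM sites too close to the end to give a full-length target.
        if len(downstream) == guide_length:
            hits.append((pam_start, match.group(1), downstream))
    return hits


def species_specific_targets(target_seq, non_target_seqs, guide_length=GUIDE_LENGTH):
    """Keep sites whose PAM plus downstream target is absent from all non-targets."""
    non_targets = [seq.upper() for seq in non_target_seqs]
    specific = []
    for pam_start, pam, downstream in candidate_targets(target_seq, guide_length):
        site = pam + downstream
        if not any(site in other for other in non_targets):
            specific.append((pam_start, pam, downstream))
    return specific


# Toy sequences (made up): the sympatric species carries TTCA instead of TTTA.
target = "GGCATTTAACGGATCCGATTACAGGCTAGC"
sympatric = "GGCATTCAACGGATCCGATTACAGGCTAGC"
print(species_specific_targets(target, [sympatric]))
```

In the toy example the two sequences differ only in the PAM itself, which is the situation described above: without the TTTV motif, Cas12a has nothing to recognise in the non-target species.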

What if you work with environmental RNA? Well, there is a CRISPR-Cas system for you too! The Cas enzyme Cas13 differs from Cas12a in that it recognises single stranded RNA molecules, with non-specific cleavage of ssRNA following target cleavage, i.e., it works the same as Cas12a but targets RNA rather than DNA.

The world of CRISPR diagnostics is still in its early stages but with the discovery of new CRISPR-Cas systems with unique functions, there is no reason ecologists cannot utilise these diagnostic tools to enhance environmental monitoring using molecular techniques. For more information on using CRISPR-Cas diagnostics for single species detection from environmental DNA read our paper here.

Methods summary: Addressing (one of) the challenges of RADseq

Article by Evan McCartney-Melstad and Brad Shaffer from University of California at Los Angeles

RADseq is a great method for gathering genomic data to answer biological questions across many different scales, from phylogenetics to population and landscape genetics. It is fast, inexpensive, and requires no previous knowledge about the species’ genomic architecture. However, with this flexibility come challenges. In this paper we develop and bench test an approach to address what may be the biggest RADseq challenge: how to choose the right sequence similarity threshold that defines whether two non-identical sequencing reads arose from the same or different genomic locations. This problem goes to the heart of evolutionary genetics: if two sequences are considered to be homologous, or derived from the same ancestral genomic location with subsequent modification through time, then they tell us a great deal about evolutionary history. If they are paralogous, and map to separate locations, then they lack that shared evolutionary history. Getting this straight is perhaps the single most important step in using genomic data for evolutionary inference.

Heat maps showing pairwise data missingness at clustering thresholds of 88% (a) and 99% (b). 

Studies that include relatively distantly related samples, such as those asking phylogenetic or biogeographical questions, should expect that homologous sequences will have diverged over time and therefore require lower similarity thresholds that allow for that divergence. However, if the threshold is set too low, paralogs will be falsely assigned to the same genomic locus, leading to problems ranging from inflated missing data rates to inaccurate measures of genetic diversity. Rather than relying on rough guesses that are preset in software packages, our approach attempts to balance these two competing forces by quantifying the relationship between pairwise genetic relatedness (as estimated directly from the data) and summaries of the RADseq dataset including pairwise data missingness and the slope of isolation by distance among samples. The relationship between pairwise genetic distance and pairwise data missingness is particularly informative—although some positive correlation is expected as mutations accumulate in enzyme restriction sites that RAD relies on, there is often a clear pattern of increased pairwise missingness that occurs when the most divergent homologous allelic variants begin to be erroneously oversplit into different presumptive loci. By explicitly looking for this breakpoint as a function of clustering threshold, researchers can choose a value that allows them to maximize the number of genomic regions recovered while minimizing the erroneous oversplitting of highly divergent, but homologous loci.
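The breakpoint idea can be sketched in a few lines of Python. This is a simplified illustration rather than the published pipeline: it assumes you have already run the clustering at several thresholds and computed, for the same ordered list of sample pairs, a pairwise genetic distance and a pairwise missingness value at each threshold. The function names, the input layout, and the 0.5 jump cut-off are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def missingness_correlations(distances, missingness_by_threshold):
    """Correlate pairwise distance with pairwise missingness at each threshold."""
    return {
        threshold: pearson(distances, missingness)
        for threshold, missingness in sorted(missingness_by_threshold.items())
    }


def choose_threshold(correlations, max_jump=0.5):
    """Return the highest threshold reached before the correlation jumps.

    max_jump is an arbitrary illustrative cut-off; in practice the breakpoint
    is usually judged by plotting the correlation against the threshold.
    """
    thresholds = sorted(correlations)
    chosen = thresholds[0]
    for previous, current in zip(thresholds, thresholds[1:]):
        if correlations[current] - correlations[previous] > max_jump:
            break  # divergent alleles are probably being oversplit from here on
        chosen = current
    return chosen


# Made-up values for four sample pairs at three clustering thresholds.
distances = [0.01, 0.02, 0.03, 0.04]
missingness = {
    0.88: [0.10, 0.12, 0.09, 0.11],
    0.92: [0.11, 0.10, 0.12, 0.11],
    0.96: [0.10, 0.20, 0.30, 0.40],
}
correlations = missingness_correlations(distances, missingness)
print(choose_threshold(correlations))
```

A sharp rise in the correlation at a stricter threshold is the signal of interest here, because it suggests that the most divergent sample pairs are losing shared loci to oversplitting rather than to restriction-site mutations alone.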

Citation: McCartney‐Melstad, E, Gidiş, M, Shaffer, HB. An empirical pipeline for choosing the optimal clustering threshold in RADseq studies. Mol Ecol Resour. 2019; 19: 1195– 1204. https://doi.org/10.1111/1755-0998.13029

Methods summary: Applying CRISPR to detect eDNA

Article by Molly-Ann Williams and Anne Parle-McDermott both from Dublin City University

We were challenged to design and build a simple and rapid species monitoring system. Why do we need such a system?  Biodiversity loss is at an all-time high and such a system would help to support the management and conservation of fish species within aquatic environments by acquiring knowledge of species distribution that traditionally is gained through visual detection and counting. These methods are expensive, time consuming and can lead to harm of the species of interest.    We decided that environmental DNA (eDNA) was the way to go but we had to solve the ‘PCR problem’ i.e., avoid having to do cyclical high temperatures as that would see us ending up with a costly, once-off device that would likely not be applied outside our lab.  This got us brainstorming and led us to a novel isothermal detection method, combining Recombinase Polymerase Amplification with CRISPR-Cas detection, which simplifies the adaptation of nucleic acid detection on to a biosensor device.

This innovative methodology utilises the collateral cleavage activity of Cas12a, an RNA-guided nuclease directed by a highly specific single CRISPR RNA, to detect specific species from eDNA. We proved it could work for eDNA by applying the technology to the detection of Salmo salar from eDNA samples collected in Irish rivers, where presence or absence had been previously confirmed using conventional field sampling. The beauty of this advance is that it can be applied to any species in the environment. Not only does this assay solve the ‘PCR problem’, it is also a better approach for distinguishing very closely related species. We look forward to others in the field adapting it to their own favourite species of interest.

Citation: Williams, M‐A, O’Grady, J, Ball, B, et al. The application of CRISPR‐Cas for single species identification from environmental DNA. Mol Ecol Resour. 2019; 19: 1106– 1114. https://doi.org/10.1111/1755-0998.13045

Interview with the author: Sociality, hyenas and DNA methylation

Methylation, the addition of methyl groups to a DNA molecule, has the interesting ability to alter the activity of a DNA segment without changing its sequence. In this behind-the-scenes look, Zachary Laubach and colleagues test whether this valuable biomarker is affected by differences in hyena social status or other ecological factors early in life. What’s particularly impressive is that they garnered insights into methylation from a wild population. They find some surprising results, such as that high-ranking mums can confer higher levels of methylation on their cubs, an effect that disappears as the cubs get older. Why? Find out below and read the full article here.

Photo credit: Zach Laubach

What led to your interest in this topic / what was the motivation for this study? 

Across a broad taxonomic spectrum, social experiences, particularly those early in life, seem to have a profound impact on organisms’ development. The idea that during sensitive periods of development, social experiences and early life environment can have lasting impacts on the later life phenotype and health is known as the Developmental Origins of Health and Disease (DOHaD) hypothesis, and was formalized in the 1980s by epidemiologists, notably David Barker through his research on cardiovascular disease. Among social mammals, including humans and non-human primates, an individual’s social rank affects their behavior, physiology, and related health outcomes. For example, in humans, low socioeconomic status is widely recognized as a risk factor for cardiovascular complications and other chronic diseases. In non-human primates, low social rank is a risk factor for elevated chronic stress and immune dysregulation. So, although we observe that social status affects biology, we still know little about how this all works. To better understand a potential mechanism for how early life environment affects biology, we investigated possible early environmental determinants of a molecular biomarker (DNA methylation) over the course of development in a population of wild spotted hyenas. Similar to many primates, hyenas live in groups organized by a social dominance hierarchy, and whether a hyena is born high or low ranking has lifelong consequences.

What difficulties did you run into along the way? 

In this study, we focused on measuring DNA methylation, which is generally of interest to researchers because it is responsive to environmental stimuli and associated with gene expression. Still, while spotted hyenas present a unique opportunity to investigate how various social experiences and ecological factors early in life are associated with biological characteristics later in life, there were no previous studies (at least none of which we were aware) that measured DNA methylation in this species. In other words, this was not like working with a well-characterized molecular biology model organism, like fruit flies or lab rats. In fact, when we were conducting our lab work there was no publicly available draft hyena genome. In our attempt to assess a potentially informative biomarker in hyenas, we measured multiple types of DNA methylation with varying degrees of success. Finally, the hyenas we study live freely in a large reserve in Kenya, so much of our data were observational and collected under a variety of field conditions, making collection of samples non-trivial.

What is the biggest or most surprising innovation highlighted in this study? 

This work represents one of a handful of studies conducted in a wild population that measure DNA methylation to better understand how early life environment may influence organisms’ biology over the course of development. Taking advantage of our approximately 30 years’ worth of continuously collected data on individually recognizable hyenas from the Masai Mara Hyena Project, we not only amassed a particularly large sample size for a long-lived, wild mammal, but we were also able to compare patterns of DNA methylation at various stages of development with respect to multiple early life environmental factors. We found that being born to a higher-ranking mom corresponded with greater global DNA methylation in young but not older hyenas. One interpretation of this result is that high-ranking moms confer some advantage to their cubs early in life, but that the effect of maternal rank per se is not evident in global DNA methylation of subadult or adult hyenas. We also found some associations between global DNA methylation and litter size, human disturbance, and prey availability in the year a hyena was born, and these associations were strongest in the youngest age group of hyenas.

Moving forward, what are the next steps in this area of research?

In our next steps we are working to understand whether specific types of early life social environments, like maternal care and how well socially connected an animal is within its group, correspond with variation in DNA methylation and adult stress. We are also utilizing more advanced techniques for measuring DNA methylation, so that we might home in on functional pathways that are involved in the development of an adverse stress phenotype. As part of our broader research agenda looking at general biological principles related to the DOHaD hypothesis, we have also teamed up with epidemiologists to ask how social status in humans affects biology. In fact, we have recently published another paper looking at the associations between maternal socioeconomic status and patterns of DNA methylation over the course of development in children who are part of the Project Viva pre-birth cohort study (check out the paper here).

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

This project was part of my PhD work, and from this experience I have learned just how fast molecular biology advances as a field. Given that this technology is constantly changing, it is critical to find mentors and collaborators with up-to-date expertise who are willing to support you. I was fortunate to work in a cutting-edge molecular laboratory, and to receive training from internationally recognized experts in Dr. Dana Dolinoy’s lab who specialize in studying DNA methylation. Additionally, in studies like these that involve large observational data sets and that aim to understand biological mechanisms, the value of clearly defined study questions, hypotheses and a complementary analytical strategy cannot be overstated. In my opinion, novel technology will not substitute for a thoughtful and well-planned analysis.

What have you learned about methods and resources development over the course of this project? 

Working in a novel system, like investigating DNA methylation in wild spotted hyenas, presents challenges and limitations that are distinct from those encountered in laboratory settings and when working with model organisms. However, there are deep insights and rich perspectives to be gained at the three-way interface between molecular biology, behavioral ecology and evolutionary biology from study populations that have intact life histories and are subject to natural selection. I have also learned that long-term field studies with uninterrupted data collection, like the Masai Mara Hyena Project, provide an invaluable resource and an unmatched opportunity to combine molecular techniques with vast collections of behavioral, demographic and ecological data. In addition, while long-term field studies represent a substantial investment of time and resources, they also present a chance for comparative research that can help elucidate basic biological principles that span taxa, like the DOHaD hypothesis. As such, I believe I have been fortunate to work with Dr. Kay Holekamp’s hyenas and that these types of long-term field studies are an asset to be prioritized and preserved.

Describe the significance of this research for the general scientific community in one sentence.

Social and ecological factors experienced early in life can correspond to changes in molecular biomarkers, like DNA methylation, that are detected over the course of development, and that may affect patterns of gene expression.

Describe the significance of this research for your scientific community in one sentence.

Findings from this research suggest that maternal rank, anthropogenic disturbance, and prey availability around the time of birth are associated with later life global DNA methylation in spotted hyenas, particularly in cubs.

Citation: Laubach, ZM, Faulk, CD, Dolinoy, DC, et al. Early life social and ecological determinants of global DNA methylation in wild spotted hyenas. Mol Ecol. 2019; 28: 3799– 3812. https://doi.org/10.1111/mec.15174