Methods summary: Addressing (one of) the challenges of RADseq

Article by Evan McCartney-Melstad and Brad Shaffer from University of California at Los Angeles

RADseq is a great method for gathering genomic data to answer biological questions across many different scales, from phylogenetics to population and landscape genetics. It is fast, inexpensive, and requires no previous knowledge about the species’ genomic architecture. However, with this flexibility come challenges. In this paper we develop and bench test an approach to address what may be the biggest RADseq challenge: how to choose the right sequence similarity threshold that defines whether two non-identical sequencing reads arose from the same or different genomic locations. This problem goes to the heart of evolutionary genetics: if two sequences are considered to be homologous, that is, derived from the same ancestral genomic location with subsequent modification through time, then they tell us a great deal about evolutionary history. If they are paralogous, and map to separate locations, then they lack that shared evolutionary history. Getting this straight is perhaps the single most important step in using genomic data for evolutionary inference.
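To make the idea of a similarity threshold concrete, here is a minimal Python sketch. It assumes two reads of equal length that are already aligned, and the function names `percent_identity` and `same_locus` are hypothetical, chosen only for illustration. Real RADseq clustering software aligns reads and handles indels, so this is a simplification rather than the method used in the paper.

```python
def percent_identity(read_a: str, read_b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(read_a) != len(read_b) or not read_a:
        raise ValueError("reads must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(read_a, read_b))
    return matches / len(read_a)


def same_locus(read_a: str, read_b: str, threshold: float) -> bool:
    """Treat two reads as one putative locus if identity meets the threshold."""
    return percent_identity(read_a, read_b) >= threshold


# Two made-up 25 bp reads that differ at 2 positions (23/25 identical).
read_1 = "ACGTACGTACGTACGTACGTACGTA"
read_2 = "ACGAACGTACTTACGTACGTACGTA"

print(percent_identity(read_1, read_2))
print(same_locus(read_1, read_2, threshold=0.88))  # clustered together
print(same_locus(read_1, read_2, threshold=0.99))  # split into two loci
```

The same pair of reads is merged at a permissive threshold and split at a strict one, which is why the choice of threshold matters so much.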

Heat maps showing pairwise data missingness at clustering thresholds of 88% (a) and 99% (b). 

Studies that include relatively distantly related samples, such as those asking phylogenetic or biogeographical questions, should expect that homologous sequences will have diverged over time and therefore require lower similarity thresholds that allow for that divergence. However, if the threshold is set too low, paralogs will be falsely assigned to the same genomic locus, leading to problems ranging from inflated missing data rates to inaccurate measures of genetic diversity. Rather than relying on rough guesses that are preset in software packages, our approach attempts to balance these two competing forces by quantifying the relationship between pairwise genetic relatedness (as estimated directly from the data) and summaries of the RADseq dataset, including pairwise data missingness and the slope of isolation by distance among samples. The relationship between pairwise genetic distance and pairwise data missingness is particularly informative. Although some positive correlation is expected as mutations accumulate in the restriction enzyme recognition sites that RAD relies on, there is often a clear pattern of increased pairwise missingness that occurs when the most divergent homologous allelic variants begin to be erroneously oversplit into different presumptive loci. By explicitly looking for this breakpoint as a function of clustering threshold, researchers can choose a value that allows them to maximize the number of genomic regions recovered while minimizing the erroneous oversplitting of highly divergent but homologous loci.
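The breakpoint search described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the published pipeline: the function names, the `jump_factor` heuristic, and the toy numbers are all invented for the example, and in practice one would inspect the curves by eye rather than trust a single cutoff.

```python
import numpy as np


def missingness_slope(distance, missingness):
    """Least-squares slope of pairwise missingness on pairwise genetic distance."""
    x = np.asarray(distance, dtype=float)
    y = np.asarray(missingness, dtype=float)
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


def threshold_before_jump(thresholds, slopes, jump_factor=2.0):
    """Return the last threshold before the slope jumps sharply.

    Assumes thresholds are sorted ascending and slopes are positive.
    jump_factor is a hypothetical heuristic, not a published value.
    """
    for i in range(1, len(slopes)):
        if slopes[i] > jump_factor * slopes[i - 1]:
            return thresholds[i - 1]
    return thresholds[-1]


# Toy data (made up): pairwise genetic distances for five sample pairs,
# and pairwise missingness for the same pairs at each clustering threshold.
distance = np.array([0.01, 0.02, 0.03, 0.04, 0.05])
missingness_by_threshold = {
    0.88: 0.10 + 1.0 * distance,
    0.92: 0.10 + 1.1 * distance,
    0.96: 0.10 + 1.2 * distance,
    0.99: 0.10 + 4.0 * distance,  # oversplitting inflates missingness
}

thresholds = sorted(missingness_by_threshold)
slopes = [missingness_slope(distance, missingness_by_threshold[t]) for t in thresholds]
chosen = threshold_before_jump(thresholds, slopes)
print(chosen)
```

In this toy example the slope stays roughly flat up to 0.96 and then rises steeply at 0.99, so the sketch picks 0.96 as the highest threshold that does not yet show signs of oversplitting.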

Citation: McCartney‐Melstad, E, Gidiş, M, Shaffer, HB. An empirical pipeline for choosing the optimal clustering threshold in RADseq studies. Mol Ecol Resour. 2019; 19: 1195–1204. https://doi.org/10.1111/1755-0998.13029

Methods summary: Applying CRISPR to detect eDNA

Article by Molly-Ann Williams and Anne Parle-McDermott both from Dublin City University

We were challenged to design and build a simple and rapid species monitoring system. Why do we need such a system? Biodiversity loss is at an all-time high, and such a system would help to support the management and conservation of fish species within aquatic environments by acquiring knowledge of species distribution that traditionally is gained through visual detection and counting. These methods are expensive, time-consuming and can lead to harm of the species of interest. We decided that environmental DNA (eDNA) was the way to go, but we had to solve the ‘PCR problem’, i.e., avoid having to do cyclical high temperatures, as that would see us ending up with a costly, once-off device that would likely not be applied outside our lab. This got us brainstorming and led us to a novel isothermal detection method, combining Recombinase Polymerase Amplification with CRISPR-Cas detection, which simplifies the adaptation of nucleic acid detection onto a biosensor device.

This innovative methodology utilises the collateral cleavage activity of Cas12a, a nuclease guided by a highly specific single CRISPR RNA, to detect specific species from eDNA. We proved it could work for eDNA by applying the technology to the detection of Salmo salar from eDNA samples collected in Irish rivers, where presence or absence had been previously confirmed using conventional field sampling. The beauty of this advance is that it can be applied to any species in the environment. Not only does this assay solve the ‘PCR problem’, it is also a better approach for distinguishing very closely related species. We look forward to others in the field adapting it to their own favourite species of interest.

Citation: Williams, M‐A, O’Grady, J, Ball, B, et al. The application of CRISPR‐Cas for single species identification from environmental DNA. Mol Ecol Resour. 2019; 19: 1106–1114. https://doi.org/10.1111/1755-0998.13045

Interview with the authors: Genomic signatures of sympatric speciation with historical and contemporary gene flow in a tropical anthozoan (Hexacorallia: Actiniaria)

Though increasing numbers of empirical studies suggest that sympatric speciation may be more common than previously thought, it is difficult to quantify the prevalence of sympatric speciation, since many different processes may lead to co-distributed sister species pairs. This difficulty is particularly pronounced in marine systems where there are relatively few barriers to dispersal. A recent paper by Benjamin Titus, Paul Blischak, and Marymegan Daly provides one of the first model-based investigations of sympatric speciation in a reef system. Titus and colleagues find support for cryptic diversity in the corkscrew anemone (Bartholomea annulata), and the two lineages that they recover co-occur. Model-based analyses support isolation with migration or secondary contact, suggesting that sympatric speciation may have occurred between these lineages. Finally, Titus and colleagues identify six loci that are putatively under divergent selection between these two lineages. Below, we go behind the scenes with lead author Benjamin Titus. Read the full article here.

Photo credit: Benjamin Titus.

What led to your interest in this topic / what was the motivation for this study? 
The motivation for this study evolved quite a bit from when I initially started the project. Initially, this work was part of a broader comparative phylogeographic study. However, like many poorly studied marine inverts, the anemone turned out to be a cryptic species complex that was fully co-distributed throughout its range. Since we found no obvious ecological differences between the cryptic taxa, the project shifted focus towards testing competing biogeographic diversification scenarios. Marine systems are highly dynamic, and species that diversify in allopatry can readily become co-distributed following secondary contact. Ultimately, we wanted to use model selection analyses to make objective inferences regarding the likelihood that this species diversified sympatrically versus allopatrically followed by secondary contact.

What difficulties did you run into along the way? 
Tropical anthozoans (e.g. corals, sea anemones, zoanthids, corallimorpharians) generally harbor endosymbiotic dinoflagellates, which allow these animals to thrive in the nutrient-poor waters of the tropics. Unfortunately, there is no avoiding them in field-collected samples, and the resulting DNA extractions harbor an unknown mix of anthozoan and dinoflagellate DNA. When I started this work no universal population-level markers existed for the Class Anthozoa, so we used a reduced representation sequencing approach. Thus, our resulting RADseq dataset is, presumably, an unknown mix of target and dinoflagellate DNA. Ultimately, we were really lucky there was a full genome from a closely related species that we could map our reads to so we could be confident that we were only left with anthozoan sequences.  

What is the biggest or most surprising finding from this study? 
I think there are a couple of important takeaways. The first is that coral reefs harbor an immense amount of biodiversity on a small fraction of seafloor, and in a setting with few hard barriers to dispersal. Sympatric speciation should be a major evolutionary process on coral reefs, but it’s rarely tested for explicitly. Given that different evolutionary processes can lead to similar biogeographic outcomes, our study is a rare empirical example demonstrating the importance of sympatric speciation on reefs.
The second is that this is the first range-wide phylogeographic study for a tropical sea anemone species, and our finding that Bartholomea annulata is a species complex underscores just how underdescribed sea anemone diversity likely is.

Moving forward, what are the next steps for this research? 
Our sampling here was necessarily coarse in order to cover the entire range of this species complex in the Tropical Western Atlantic. Fine scale sampling and sequencing would be nice to try and pin down any ecological differences between these cryptic taxa that may exist. Broadly, the field of marine phylogeography needs more evolutionary studies that incorporate demographic modeling into their analyses so we can better understand the relative contributions of allopatric and sympatric speciation on coral reefs.

What would your message be for students about to start their first research projects in this topic? 
Some of the most widely recognized species are actually cryptic species complexes. If you work on a poorly studied group and want to conduct population-level research, make sure you take the time to confirm you are only dealing with a single species. This is true for any group, but is especially true for marine invertebrates.

What have you learned about science over the course of this project? 
Staying on the poorly studied taxa theme, if you work on one, there’s an immense amount of basic systematic research that needs to be done. This project came out of my dissertation research, which I developed on what I thought were common and widely recognized species. A lot of my work turned into disentangling the systematics of cryptic species complexes. This is time consuming, but important so that downstream studies are framed in the proper taxonomic context.

Describe the significance of this research for the general scientific community in one sentence.
Sympatric speciation is an important, but difficult to demonstrate, evolutionary process in the marine environment.

Describe the significance of this research for your scientific community in one sentence.
Explicit tests of competing diversification scenarios are important to disentangle different evolutionary processes that can lead to similar biogeographic outcomes on coral reefs.

Summary from the authors: Boomeranging around Australia: Historical biogeography and population genomics of the anti-equatorial fish Microcanthus strigatus (Teleostei: Microcanthidae)

Photo credit: Shigeru Harazaki.

The study of species and where they live is of particular interest to biologists, because it not only allows us to gain insight into genetic diversity, but also into how different populations interact. Animals with widespread distributions are often assumed to be of least concern. This can be misleading, as it does not take into account the possibility of fragmentation and population disjunction. The Stripey fish Microcanthus strigatus is one example, as it is listed as being of least concern on the IUCN Red List. Although it spans a wide distribution across the western Pacific and eastern Indian Oceans, our study suggests that populations in Western Australia, the southwest Pacific (including eastern Australia), Hawaii and East Asia are very genetically divergent. Several of these populations have been isolated since the last glacial cycle in the Pleistocene epoch, and are currently so fragmented that no contemporary genetic exchange occurs. This is of significant conservation concern as a once widespread population is revealed to consist of four cryptic groups, especially in light of evidence suggesting that the Hawaiian population is currently in decline and that the southwest Pacific population is distinct enough to warrant recognition as a different species. 

Read the full article:
Tea Y‐K, Van Der Wal C, Ludt WB, Gill AC, Lo N, Ho SYW. Boomeranging around Australia: Historical biogeography and population genomics of the anti‐equatorial fish Microcanthus strigatus (Teleostei: Microcanthidae). Mol Ecol. 2019;28:3771–3785. https://doi.org/10.1111/mec.15172

Interview with the authors: Latitudinal divergence in a wide-spread amphibian: contrasting patterns of neutral and adaptive genomic variation

It is difficult to separate the effects of demography and historic processes from the effects of selection, particularly in species that are widespread over heterogeneous environments. In this paper, Patrik Rödin‐Mörch and colleagues use reduced-representation genomic data to investigate the demographic and selective forces driving patterns of genetic diversity in the moor frog. They find evidence of two refugial lineages with support for gene flow between lineages, and they find striking differences between neutral and putatively adaptive markers. Read the full article here, and see below for an interview with the authors.

What led to your interest in this topic / what was the motivation for this study? 
We are generally interested in how amphibian populations diverge along environmental gradients, in particular relating to latitude. We have previously focused on adaptive divergence in phenotypic traits relating to growth and development in this system. Amphibians occurring at higher latitudes are very constrained by seasonality and differences in thermal regimes as well as other aspects of the environment, and this should result in strong selection to cope with these constraints. In northern Europe, populations also have a history of glacially mediated range expansions and we are very interested in how this influences divergence along the gradient. Amphibians are very good organisms to study local adaptation as a number of species have quite wide distributions where they occur in different habitat types and thermal regimes with large differences in season length. We wanted to build on previous research by taking a more genome-wide approach that would enable us to detect signatures of divergent selection, explore the distribution of genetic variation along the gradient and model the post-glacial demographic history of the populations.

What difficulties did you run into along the way?
Applying a custom ddRAD library prep protocol to R. arvalis for the first time was a bit challenging in the beginning, as the protocol was put together in another lab for another organism. Because of the large genome of this species, it was challenging to settle on which restriction enzyme combination to use and to predict how many fragments it would produce, as we wanted to multiplex ~150 individuals and had limited funds for sequencing. We also wanted to sample populations over the contact zone to get a more comprehensive look at what’s going on there, but finding populations in between the two edge regions of the contact zone was ultimately unsuccessful.

What is the biggest or most surprising finding from this study? 
The finding that intrigued us the most was the contrasting way neutral and putatively adaptive genetic variation is distributed along the gradient. This was particularly so over the post-glacial contact zone, both in terms of nucleotide diversity and based on hybrid index estimation. We were also very pleased that we obtained good support for a model that describes what we initially thought was the correct post-glacial demographic scenario, involving two lineages diverging before the last glacial maximum. After divergence they colonized Scandinavia from two different directions, with gene flow occurring over a contact zone that we could place further south than previously proposed.

Moving forward, what are the next steps for this research? 
In order to continue this work, the next step will be to replicate the latitudinal gradient on the eastern side of the Baltic Sea, as well as to obtain samples across the contact zone. The plan is also to move away from ddRAD-seq to RNA-seq, and eventually whole genome sequencing. We are currently planning to look at how gene expression as well as SNP variation differs with latitude and combine that information with common garden experiments on larval life-history variation. Ultimately we want to understand the genetic basis of local adaptation based on larval life-history variation and how the demographic effects of post-glacial range expansion have influenced that.

What would your message be for students about to start their first research projects in this topic? 
Make sure you know the literature. Many previous studies have investigated adaptive divergence along various environmental gradients for a number of species, including amphibians in different settings. Also, be prepared to conduct extensive field work, common garden experiments, lab work and bioinformatics, and make sure you have collaborators that can help you out.

What have you learned about science over the course of this project? 
That things rarely work out the way you first planned, and sometimes you need to adjust your conceptual and methodological approach as you go along. Another important lesson is the value of collaboration and relying on other people’s expertise and skills.

Describe the significance of this research for the general scientific community in one sentence.
Amphibian populations extending their distribution range northwards after the last ice age have adapted to the environmental constraints experienced at higher latitudes and this has influenced the distribution of genetic variation along the gradient.

Describe the significance of this research for your scientific community in one sentence.
We find neutral and putatively adaptive gene flow over a post-glacial contact zone within a single species and together with strong environmental constraints and historical range dynamics this has shaped patterns of contrasting genetic variation and adaptive divergence along the gradient.

Full article:

Rödin‐Mörch P, Luquet E, Meyer‐Lucht Y, Richter‐Boix A, Höglund J, Laurila A. Latitudinal divergence in a widespread amphibian: Contrasting patterns of neutral and adaptive genomic variation. Mol Ecol. 2019;28:2996–3011. https://doi.org/10.1111/mec.15132

Interview with the author: Sociality, hyenas and DNA methylation

The addition of methyl groups to a DNA molecule, or methylation, has the interesting ability to alter the activity of a DNA segment without changing the sequence. In this behind-the-scenes look, Zachary Laubach and colleagues test whether this valuable biomarker is affected by differences in hyena social status or other ecological factors early in life. What’s particularly impressive is that they garnered insights into methylation from a wild population. They find some surprising results, such as that high-ranking mums can confer higher levels of methylation on their cubs, an effect that disappears as the cubs get older. Why? Find out below and read the full article here.

Photo credit: Zach Laubach

What led to your interest in this topic / what was the motivation for this study? 

Across a broad taxonomic spectrum, social experiences, particularly those early in life, seem to have a profound impact on organisms’ development. The idea that during sensitive periods of development, social experiences and early life environment can have lasting impacts on the later life phenotype and health is known as the Developmental Origins of Health and Disease (DOHaD) hypothesis, and was formalized in the 1980s by epidemiologists, notably David Barker through his research on cardiovascular disease. Among social mammals, including humans and non-human primates, an individual’s social rank affects their behavior, physiology, and related health outcomes. For example, in humans, low socioeconomic status is widely recognized as a risk factor for cardiovascular complications and other chronic diseases. In non-human primates, low social rank is a risk factor for elevated chronic stress and immune dysregulation. So, although we observe that social status affects biology, we still know little about how this all works. To better understand a potential mechanism for how early life environment affects biology, we investigated possible early environmental determinants of a molecular biomarker (DNA methylation) over the course of development in a population of wild spotted hyenas. Similar to many primates, hyenas live in groups organized by a social dominance hierarchy, and whether a hyena is born high or low ranking has lifelong consequences.

What difficulties did you run into along the way? 

In this study, we focused on measuring DNA methylation, which is generally of interest to researchers because it is responsive to environmental stimuli and associated with gene expression. Still, while spotted hyenas present a unique opportunity to investigate how various social experiences and ecological factors early in life are associated with biological characteristics later in life, there were no previous studies (at least of which we were aware) that measured DNA methylation in this species. In other words, this was not like working with a well characterized molecular biology model organism, like fruit flies or lab rats. In fact, when we were conducting our lab work there was no publicly available draft hyena genome. In our attempt to assess a potentially informative biomarker in hyenas, we measured multiple types of DNA methylation with varying degrees of success. Finally, the hyenas we study live freely in a large reserve in Kenya, so much of our data were observational and collected under a variety of field conditions making collection of samples non-trivial.

What is the biggest or most surprising innovation highlighted in this study? 

This work represents one of a handful of studies conducted in a wild population that measures DNA methylation to better understand how early life environment may influence organisms’ biology over the course of development. Taking advantage of our approximately 30 years’ worth of continuously collected data on individually recognizable hyenas from the Masai Mara Hyena Project, we not only amassed a particularly large sample size for a long-lived, wild mammal, but we were also able to compare patterns of DNA methylation at various stages of development with respect to multiple early life environmental factors. We found that being born to a higher-ranking mom corresponded with greater global DNA methylation in young but not older hyenas. One interpretation of this result is that high ranking moms confer some advantage to their cubs early in life, but that the effect of maternal rank per se is not evident in global DNA methylation of subadult or adult hyenas. We also found some associations between global DNA methylation and litter size, human disturbance, and prey availability in the year a hyena was born, and these associations were strongest in the youngest age group of hyenas.

Moving forward, what are the next steps in this area of research?

In our next steps we are working to understand whether specific types of early life social environments, like maternal care and how socially well connected an animal is within its group, correspond with variation in DNA methylation and adult stress. We are also utilizing more advanced techniques for measuring DNA methylation, so that we might home in on functional pathways that are involved in the development of an adverse stress phenotype. As part of our broader research agenda looking at general biological principles related to the DOHaD hypothesis, we have also teamed up with epidemiologists to ask how social status in humans affects biology. In fact, we have recently published another paper looking at the associations between maternal socioeconomic status and patterns of DNA methylation over the course of development in children who are part of the Project Viva pre-birth cohort study (check out the paper here).

What would your message be for students about to start developing or using novel techniques in Molecular Ecology?

This project was part of my PhD work, and from this experience I have learned just how fast molecular biology advances as a field. Given that this technology is constantly changing, it is critical to find mentors and collaborators with up-to-date expertise who are willing to support you. I was fortunate to work in a cutting-edge molecular laboratory, and to receive training from internationally recognized experts in Dr. Dana Dolinoy’s lab who specialize in studying DNA methylation. Additionally, in studies like these that involve large observational data sets and that aim to understand biological mechanisms, the value of clearly defined study questions, hypotheses and a complementary analytical strategy cannot be overstated. In my opinion, novel technology will not substitute for a thoughtful and well-planned analysis.

What have you learned about methods and resources development over the course of this project? 

Working in a novel system, like investigating DNA methylation in wild spotted hyenas, presents challenges and limitations that are distinct from those encountered in laboratory settings and when working with model organisms. However, there are deep insights and rich perspectives to be gained at the three-way interface between molecular biology, behavioral ecology and evolutionary biology from study populations with intact life histories and that are subject to natural selection. I have also learned that long-term field studies with uninterrupted data collection, like the Masai Mara Hyena Project, provide an invaluable resource and an unmatched opportunity to combine molecular techniques with vast collections of behavioral, demographic and ecological data. In addition, while long-term field studies represent a substantial investment of time and resources, they also present a chance for comparative research that can help elucidate basic biological principles that span taxa, like the DOHaD hypothesis. As such, I believe I have been fortunate to work with Dr. Kay Holekamp’s hyenas and that these types of long-term field studies are an asset to be prioritized and preserved.

Describe the significance of this research for the general scientific community in one sentence.

Social and ecological factors experienced early in life can correspond to changes in molecular biomarkers, like DNA methylation, that are detected over the course of development, and that may affect patterns of gene expression.

Describe the significance of this research for your scientific community in one sentence.

Findings from this research suggest that maternal rank, anthropogenic disturbance, and prey availability around the time of birth are associated with later life global DNA methylation in spotted hyenas, particularly in cubs.

Citation: Laubach, ZM, Faulk, CD, Dolinoy, DC, et al. Early life social and ecological determinants of global DNA methylation in wild spotted hyenas. Mol Ecol. 2019; 28: 3799–3812. https://doi.org/10.1111/mec.15174