CRISPR-Cas Diagnostics for Environmental Monitoring

In a special blog post, Molly-Ann Williams (@WilliamsMolly_9) and Anne Parle-McDermott (@anne_parle) from the School of Biotechnology and DCU Water Institute, Dublin City University, provide an overview of how CRISPR-Cas works and how it can be applied to ecology, and to monitoring in particular. Read their recently published Molecular Ecology Resources paper here.

The field of CRISPR-Cas for genome editing has simply exploded since its introduction in 2012. The discovery of many different Cas enzymes with additional natural or genetically engineered functionalities is driving an increase in CRISPR-Cas applications across fields from food security to medicine.

Number of Scopus search results for query “CRISPR” in given year. Search performed on 21 November 2019.

So how can we join the revolution and apply CRISPR-Cas to the field of Ecology?

CRISPR-Cas systems consist of two main elements: a guide and a nuclease. Guides (made of RNA) direct the nuclease (Cas enzyme) to specific nucleic acid sequences (DNA or RNA). Upon target recognition the nuclease carries out the desired response, most commonly cleavage of the target sequence. The initially discovered CRISPR-Cas system relied on a nuclease called Cas9. This enzyme is involved in highly specific cleavage of target sequences that allow genome editing to occur by activating the natural repair system of the cell. More recently the applications of this system have been expanded beyond genome editing by the discovery of several new Cas enzymes with a secondary function i.e., the indiscriminate cleavage of single stranded nucleic acids upon target recognition. The discovery of these Cas enzymes has revolutionised nucleic acid diagnostics due to two main features:

  1. Protein-guide and cleavage molecules (Cas): able to specifically recognise target nucleic acids, cleave the target sequence and subsequently cleave other non-specific nucleic acids.
  2. Nucleic acids as reporters: the non-specific nucleic acids can be designed as a reporter molecule that releases measurable signal when cleaved. This allows us to visualise when the initial target sequence has been detected and apply it to diagnostics and species monitoring.

Two main elements of a CRISPR-Cas diagnostic system: Cas enzyme and guide RNA effector complex and single stranded (ss) nucleic acid reporter molecule. In this example, the nuclease is Cas12a specific to DNA detection downstream from a TTTV PAM site. Adapted from Williams MA et al (2019).

The three main Cas enzymes of interest for diagnostics are Cas12, Cas13 and Cas14, each with unique functions applicable to different types of tests (for a more detailed discussion of these enzymes visit this blog).

The Cas enzyme most relevant for single species detection from environmental DNA is the enzyme Cas12a. This nuclease can detect both ssDNA and dsDNA but can only recognise DNA sequences downstream from a TTTV protospacer adjacent motif (PAM). Importantly, Cas12a cannot detect DNA sequences missing this PAM site. This is vital when designing single species detection assays.

Do you have two closely related species that you want to distinguish? Searching your target species sequence for a site downstream of a PAM site found ONLY in your target, and not in sympatric species, will ensure highly specific recognition and prevent detection of non-target species.
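
The PAM-based search described above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated design tool: the function names are hypothetical, the 20-nucleotide protospacer length is an assumption, and only exact sequence matches are compared (a real assay design would also need to consider mismatch tolerance and intraspecific variation). "V" in TTTV is the IUPAC code for A, C or G.

```python
# Minimal sketch: list candidate Cas12a sites (TTTV PAM + downstream protospacer)
# and keep those found in the target species only.
# All names and the 20-nt spacer length are illustrative assumptions.

PAM_V = set("ACG")  # IUPAC "V": any base except T
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def cas12a_sites(seq, spacer_len=20):
    """Protospacers lying immediately downstream (3') of a TTTV PAM, on either strand."""
    seq = seq.upper()
    sites = set()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - 4 - spacer_len + 1):
            pam = strand[i:i + 4]
            if pam[:3] == "TTT" and pam[3] in PAM_V:
                sites.add(strand[i + 4:i + 4 + spacer_len])
    return sites


def target_specific_sites(target_seq, non_target_seqs, spacer_len=20):
    """PAM-adjacent protospacers in the target that are not PAM-adjacent in any non-target."""
    candidates = cas12a_sites(target_seq, spacer_len)
    for other in non_target_seqs:
        candidates -= cas12a_sites(other, spacer_len)
    return candidates


if __name__ == "__main__":
    # Toy sequences: the sympatric relative differs only in the PAM (TTTA -> TCTA).
    spacer = "ACGTACGTACGTACGTACGT"
    target = "GG" + "TTTA" + spacer + "CC"
    relative = "GG" + "TCTA" + spacer + "CC"
    print(target_specific_sites(target, [relative]))
```

In the toy example the two sequences share the same 20-nt stretch, but only the target carries a TTTV PAM next to it, which is the situation the text describes as ideal for distinguishing closely related species.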

What if you work with environmental RNA? Well, there is a CRISPR-Cas system for you too! The Cas enzyme Cas13 differs from Cas12a in that it recognises single stranded RNA molecules, with non-specific cleavage of ssRNA following target cleavage, i.e., it works the same as Cas12a but targets RNA rather than DNA.

The world of CRISPR diagnostics is still in its early stages but with the discovery of new CRISPR-Cas systems with unique functions, there is no reason ecologists cannot utilise these diagnostic tools to enhance environmental monitoring using molecular techniques. For more information on using CRISPR-Cas diagnostics for single species detection from environmental DNA read our paper here.

Summary from the authors: genetic architecture of sexual dimorphism in an interspecific cross

The evolution of differences between females and males, or sexual dimorphism (SD), is very common in animals but rare in plants. These differences emerge because there is a conflict of interests between sexes to maximize their reproductive success. Thus, moving genes of reproductive traits to low recombining regions such as the sex chromosomes might be one way to solve this conflict at the genomic level. Closely related species with young sex chromosomes, which differ in the degree of SD, are ideal systems to explore the underlying genetic architecture of SD. We have crossed a female from Silene latifolia with marked SD with a male from S. dioica with less SD. We performed a QTL analysis of reproductive and vegetative traits in the F2 hybrids to find out if sexually dimorphic traits are located on the sex chromosomes, and how they contribute to species differences. Our results support that evolutionarily young sex chromosomes are important for the expression of both SD and species differences. Moreover, transgressive segregation (traits with extreme values) and a reversal of SD in the F2s indicated that SD is constrained within the species but not in the recombinant hybrids. Sexual selection can, thus, contribute to speciation.

Full article: Baena-Díaz F, Zemp N, Widmer A. 2019. Insights into the genetic architecture of sexual dimorphism from an interspecific cross between two diverging Silene (Caryophyllaceae) species. Molecular Ecology. https://doi.org/10.1111/mec.15271

Interview with the authors: Massive introgression of major histocompatibility complex (MHC) genes in newt hybrid zones

Hybridization is a mechanism by which adaptive alleles can cross species boundaries and possibly boost the adaptive potential of hybridizing species. This may be especially true for alleles that confer a selective advantage when rare, which is common among major histocompatibility complex (MHC) genes involved in pathogen defense. We would therefore expect MHC genes to introgress across hybridizing species relatively easily, though there exist relatively few examples supporting this hypothesis. In this paper from Molecular Ecology, Katarzyna Dudek, Tomasz Gaczorek, Piotr Zieliński, and Wiesław Babik document the extent of introgression of MHC variants between two hybridizing European newt species across replicated transects. Read below for a behind-the-scenes look at their paper!

Link to the study: https://onlinelibrary.wiley.com/doi/full/10.1111/mec.15254

F1 hybrid male. Photo from M. Niedzicka

What led to your interest in this topic / what was the motivation for this study? 
The evolutionary significance of adaptive introgression is increasingly appreciated and many examples have been described, but few generalizations are available. There is a relatively well understood mechanism – novel/rare allele advantage – which should promote introgression of genes evolving under balancing selection (a prime example of these are MHC genes). However, balancing selection itself produces signatures resembling introgression, so convincing demonstration of introgression in genes under balancing selection is difficult. Hybrid zones, especially in the form of replicated transects, are among the best tools you can imagine for such a project. And we’ve been studying these newts for some time – in a way this study was motivated by our long-standing interest in adaptive introgression, but it’s an off-shoot of another project (see the paper in the same issue of Mol. Ecol.).

What difficulties did you run into along the way? 
The most difficult part was the design and justification of simulations that we used to rule out explanations alternative to introgression. Because MHC in newts is multi-locus and shows extensive copy number variation, it’s been difficult to design simulations that would at the same time be realistic and feasible. This may sound surprising, but genotyping and interpretation of MHC variation has not been a major problem, although the system is quite complicated. It seems that the field has matured enough that exon-based genotyping of MHC variation has become a standard. Another frontier would be population genetic analysis of entire MHC haplotypes, extremely interesting but currently beyond reach in non-model (and most model) taxa.

Field sampling. Photo from M. Liana

What is the biggest or most surprising finding from this study? 
The scale of apparently adaptive introgression. It’s not only that MHC variants introgress – we have suspected this before. One could expect that a single or a handful of novel, introgressed MHC haplotypes would be favoured in the recipient species, but we found massive introgression, apparently involving tens or more haplotypes, most likely in both directions. It’s been quite a surprise for us – this suggests that introgression can really remodel MHC variation in hybridizing species – an influx of a large amount of variation may cause species to share, at least locally, a pool of MHC variation.

Moving forward, what are the next steps for this research?
A natural next step is to test the generality of our findings. The mechanism of novel/rare allele advantage should operate rather universally. If so, we expect that MHC genes will be among the last genes to stop introgressing between species that still hybridize, but are strongly reproductively isolated genome-wide. In other words, we expect MHC introgression should be detectable (and perhaps strong) in systems where, despite hybridization, there is very little genome-wide introgression. We’ve been lucky to obtain funding for a collaborative project, in which we are going to test this prediction using over twenty hybrid zones from major vertebrate groups. We’d also like to look at the process at the level of entire haplotypes, but this will need to wait until technologies mature.

Albino L. montandoni male. Photo from W. Babik

What would your message be for students about to start their first research projects in this topic?
The most important would probably be: have your questions worked out and if you find a system that is good to address them – go for it. Try to understand the available theory; there’s nothing more practical than good theory to guide you and to save countless hours of your precious time. And finally, start writing before you think you’re ready. Writing is the best way to have your ideas clear, to spot weak points and see things you didn’t realize before.

What have you learned about science over the course of this project? 
Over and over again – that science is unpredictable. That reality mocks your well laid out ideas and plans, twisting and turning your paths, but if you recognize and follow the opportunities that appear on the way, everything will be fine :). For example, something that appears as an offshoot of a major project may turn out at least equally interesting and important. Two key components are good and diverse collaboration and the scale of research appropriate to your question – that is, just large enough to provide sound answers, but not necessarily larger.

Field sampling pt 2. Photo from W. Babik

Describe the significance of this research for the general scientific community in one sentence.
Our research suggests that MHC introgression may be a widespread process that introduces novel and restores previously lost variation, boosting the adaptive potential of hybridizing taxa.

Citation
Dudek, K., Gaczorek, T.S., Zieliński, P. and Babik, W., 2019. Massive introgression of MHC genes in newt hybrid zones. Molecular Ecology. 28(21). 4798-4810. https://onlinelibrary.wiley.com/doi/full/10.1111/mec.15254

Interview with the authors: RAD‐sequencing for estimating genomic relatedness matrix‐based heritability in the wild: A case study in roe deer

Working on non-model organisms comes with both challenges and rewards. While the joy and satisfaction of uncovering knowledge in wild populations drives many scientists, the lack of genomic resources can be a roadblock for many important research themes, such as determining the extent of evolutionary potential and response to selection. In this paper from Molecular Ecology Resources, Laura Gervais and co-authors demonstrate the potential for RAD-sequencing to overcome these challenges and estimate heritability and evolutionary potential in wild populations, even for non-model organisms without many existing genomic resources. Read below for a behind-the-scenes look at their paper!

Link to the study: https://onlinelibrary.wiley.com/doi/full/10.1111/1755-0998.13031

Photo of male and female roe deer (Capreolus capreolus) from Wikimedia Commons

What led to your interest in this topic / what was the motivation for this study? 
We are interested in how natural populations adapt to environmental changes. These changes occur rapidly and there is an urgent need to accumulate results on wild populations’ capacity for adaptation for a wide range of species. Traditionally, measuring the evolutionary potential of a trait required long-term field surveys of phenotypic data and genetic relatedness obtained from a multi-generational pedigree. This is challenging to obtain because many free-ranging populations are hard to sample with the intensity required for pedigree reconstruction. We believe that genome-wide data, and in particular RAD-sequencing data, might be an opportunity to overcome this issue, but we still lack an accessible practical framework to go from genomic data to the estimation of a population’s evolutionary potential.

What difficulties did you run into along the way? 
We had to overcome two main methodological difficulties. First, to investigate how the sequencing strategy and the SNP calling/filtering procedure ultimately affect GRM-based heritability, we had to run a considerable number of bioinformatic and quantitative genetic analyses, both of which proved to be time consuming. Secondly, there was not much methodology available on how to implement a genomic relatedness matrix in a quantitative genetic linear mixed model. We hope that our work will make this approach more easily accessible.
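
For readers unfamiliar with the term, a genomic relatedness matrix (GRM) can be computed directly from SNP genotypes. The sketch below uses one widely used formulation (the first method of VanRaden, 2008), which is an assumption here and not necessarily the estimator used in the paper. Genotypes are coded as 0, 1 or 2 copies of one allele, missing data are not handled, and the function name is hypothetical. In a mixed ("animal") model, a matrix of this kind would take the place of the pedigree-derived relatedness matrix when estimating additive genetic variance and hence heritability.

```python
# Minimal sketch of a genomic relatedness matrix (GRM), assuming the
# VanRaden (2008) method-1 formulation: G = Z Z' / (2 * sum_j p_j (1 - p_j)),
# where Z holds allele counts centred on twice the allele frequency.
# Pure Python, no missing genotypes; names are illustrative only.


def genomic_relatedness_matrix(genotypes):
    """genotypes: list of individuals, each a list of allele counts (0, 1 or 2) per SNP."""
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])

    # Allele frequency of the counted allele at each SNP.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n_ind) for j in range(n_snp)]

    # Scaling factor: expected variance of allele counts summed over SNPs.
    denom = 2 * sum(p * (1 - p) for p in freqs)
    if denom == 0:
        raise ValueError("all SNPs are monomorphic; the GRM is undefined")

    # Centre each genotype on its expectation, 2p.
    centred = [[ind[j] - 2 * freqs[j] for j in range(n_snp)] for ind in genotypes]

    # Pairwise cross-products give the relatedness between individuals.
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / denom for k in range(n_ind)]
        for i in range(n_ind)
    ]


if __name__ == "__main__":
    # Three individuals genotyped at two SNPs (toy data).
    toy = [[0, 2], [2, 0], [1, 1]]
    for row in genomic_relatedness_matrix(toy):
        print(row)
```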

What is the biggest or most surprising finding from this study? 
When we started the study, we did not expect that it would be possible to run genomic quantitative genetic analyses with only a few hundred individuals. Most of our colleagues were skeptical when we mentioned that we found significant heritability (at the beginning with only 170 genotyped individuals). Our results give hope that evolutionary potential studies in the wild might be accessible for virtually any natural population when using the appropriate sampling and sequencing design.

Moving forward, what are the next steps for this research?
We are working to combine genome-wide data with intensive bio-logging technology (data on animal movement) and high-resolution habitat information. The synergy between these three high-density data technologies offers a great opportunity to understand how species adapt to environmental changes across complex landscapes.

What would your message be for students about to start their first research projects in this topic?
Our message would be to never hesitate to contact people and surround yourself with all the necessary help. This is a domain that evolves rapidly and that is very exciting but may be quite disconcerting. It seems essential to remain informed and open-minded. Lastly, I would say that self-learning is really rewarding but that there is always the opportunity to ask for help to learn and get over a problem efficiently.

What have you learned about science over the course of this project? 
We have learned that more interdisciplinary exchanges between ecologists, molecular biologists and bioinformaticians are useful and can help to build such an integrative approach. This may be challenging as they often have different views on different issues that need to be reconciled. There is a need to meet and exchange ideas to get the most out of this type of project.

Describe the significance of this research for the general scientific community in one sentence.
This study sheds light on a unique opportunity to evaluate whether species have the genetic potential to adapt to environmental changes, and to do so for virtually any non-model organism.

Citation
Gervais, L., Perrier, C., Bernard, M., Merlet, J., Pemberton, J. M., Pujol, B., & Quéméré, E. (2019). RAD‐sequencing for estimating genomic relatedness matrix‐based heritability in the wild: A case study in roe deer. Molecular Ecology Resources. 19(5). 1205-1217. https://onlinelibrary.wiley.com/doi/abs/10.1111/1755-0998.13031

Interview with the authors: testing the role of ecological selection on color pattern variation

The color variation that exists among individuals has lent itself to the study of selection since Darwin. Recently, Zaman, Hubert, and Schoville (2019) investigated the effects of selection on the diversity of the wing color pattern in the butterfly Parnassius clodius across a large portion of its range. These researchers found evidence supporting the idea that coloration may serve as a warning signal to predators, providing some predator avoidance benefits to individuals. In addition, the variation of solar radiation and precipitation observed geographically across sites was negatively correlated with the amount of melanin observed at each site. This suggests that the occurrence of melanin may provide a selective advantage in the form of thermoregulatory function. For more information on how selection influences butterfly wing coloration, please see the full article and the interview with Dr. Schoville below. 

Parnassius clodius mating, male located below female. Photo by Sean Schoville.

What led to your interest in this topic / what was the motivation for this study? Khuram and I were both interested in butterfly color pattern variation, and in particular, cases where there might be competing selective pressures acting on wing pattern phenotypes. Most work on butterfly wing patterns focuses on predator-prey interactions and aposematic colors, but butterfly wings are essential to flight performance and important in thermal regulation. A number of recent papers have shown that butterfly color pattern appears to be responding to climate warming, and then there are well-known cases (such as alpine Colias, thanks to Ward Watt) where thermoregulation has been linked to basking behavior and the pigment on wings. Thus, in examining variation in Parnassius clodius (which occurs over a broad elevation and latitudinal range), we hoped that we could decouple environmental signals that might act on different wing color elements. 

What difficulties did you run into along the way? Sampling our butterflies across this large region was a major challenge, particularly as adult flight times are rather short (two to three weeks). And then we were surprised by the strong difference in adult phenology across sites (some adults are active in May, others in late July). This is evident regionally (Utah versus Washington), and across elevation within a region. In the end, it required three years of effort, with some very long road trips from Wisconsin.

What is the biggest or most surprising innovation highlighted in this study? After some initial efforts to link aposematic variation (red eyespots) indirectly to predator communities (through climate variables that might covary with predator abundance), we realized this was too tenuous. So, we were delighted to discover publicly available data on bird abundance. While this did not solve the problem (perhaps due to lack of spatial resolution in the bird data), I think using this data to analyze butterfly wing patterns was one of the more innovative aspects of our paper. We had a much easier time linking spatial climate data to melanization (dark pigmentation). As an aside, this raises the important point that some data, i.e. abiotic environmental data, is much easier to come by than biotic data. This is unfortunate, as we expect biotic selective forces to be equally or more important drivers of microevolution in some cases.

Lead author on this study, Khuram Zaman, after a day of collecting samples. Photo by Sean Schoville.

Moving forward, what are the next steps in this area of research? We’d like to extend this work in two directions. First, we’d like to connect color pattern traits to underlying genes and measures of heritability. Although the genes controlling butterfly color pattern are well studied, to date no representatives of the snow Apollo subfamily Parnassiinae have been included in these efforts. Members of the family are tremendously variable and quite stunning in their dramatic contrasts of color. Second, experiments are needed to link our inferences of ecological selection to fitness differences, as well as performance in the field. Physiological assays of melanic variants, coupled with mechanistic thermodynamic models, have been developed for Colias butterflies (see Joel Kingsolver and Lauren Buckley’s work). This type of modeling could provide important connections to conservation of Parnassius clodius populations under changing climates, and might perhaps extend to conservation work on other highly threatened Parnassius species.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? The development of novel approaches is a key part of advancing biological knowledge, but it can be a daunting endeavor given the breadth and scope of the scientific literature nowadays. Integrating multiple approaches, on the other hand, can equally help to advance our knowledge and provide opportunities to address long-standing questions. This is the direction we took in this paper. My personal view is that students should try to master multiple techniques (assemble a toolkit, so to speak) and apply those techniques to fundamental problems. Hopefully, it’s a lot of fun in the process and leads to interesting collaborations!

What have you learned about methods and resources development over the course of this project? We are entering a golden age of data-rich resources, in terms of spatial environmental data and genomics data. These increasingly provide the power to test refined hypotheses about evolutionary and ecological processes, and are becoming more accessible to all researchers. One of my favorite accomplishments in the paper is using genetic covariance data among populations (relatedness data) as a covariate in fitting morphology ~ environment models. The use of such population contrasts is important in controlling for non-independence in the data due to ancestry. While we have known about the importance of genetic covariance in hypothesis testing for some time (thanks to Joseph Felsenstein’s work), it has only recently become possible to use genome-wide data. This provides very precise measures that are highly informative, and enabled us to rule out the role of genetic drift as a driver of wing pattern variation.

Describe the significance of this research for the general scientific community in one sentence. Our research demonstrates that butterfly wing color patterns evolve in response to local climate conditions, as a way to regulate body temperature.

Describe the significance of this research for your scientific community in one sentence. Our work demonstrates that elements of butterfly wing pattern phenotypes respond independently to different sources of selection, with climate variation acting on thermoregulatory ability as an important driver of butterfly color pattern.

Parnassius clodius basking to thermoregulate. Photo by Sean Schoville.

Zaman K, Hubert MK, Schoville SD. 2019. Testing the role of ecological selection on color pattern variation in the butterfly Parnassius clodius. Molecular Ecology 28:5086-5102.

Interview with the authors: Environmental heterogeneity and not vicariant biogeographic barriers generate community‐wide population structure in desert‐adapted snakes

Phylogeographic studies have long focused on striking biogeographic barriers, and comparative phylogeography often looks for shared divergence across such barriers as evidence of shared responses to similar environments across taxa. However, in addition to such barriers, geographic distances and local adaptation to environmental heterogeneity may shape genetic divergence. In their recent Molecular Ecology paper, Myers and colleagues collect genomic data from 13 co-distributed species of snakes from Southwestern North America and evaluate the relative importance of biogeographic barriers, geographic distance, and environmental heterogeneity in structuring genetic divergence. Much of the previous phylogeographic work in this region has focused on divergence across a prominent biogeographic barrier: the Cochise Filter Barrier (CFB), which separates the Sonoran and Chihuahuan Deserts, and divergence across this barrier has been suggested to be an important factor driving divergence in snakes from the region. Though they expected to find a prominent role of this barrier, instead, Myers and colleagues find strong support for geographic distance and environmental heterogeneity as important factors structuring genetic divergence, but less support for biogeographic barriers. Further, they find that different variables contribute most to divergence across the 13 taxa studied, highlighting the importance of species-specific responses to environmental variation. Read the full article here, and read below for a behind-the-scenes interview with lead author Edward Myers.

What led to your interest in this topic / what was the motivation for this study? 
As a research team we have a general interest in what factors are promoting population genetic differentiation and whether codistributed species have similar evolutionary histories in response to shared environmental changes over time. Specifically, in this system where there is a well known biogeographic barrier (Cochise Filter Barrier; CFB), we were interested in whether entire assemblages of taxa show similar population structure. Initially the motivation for this study was to assess the degree of co-divergence across the CFB; however, as we analyzed these data it became clear that we needed to incorporate spatial and environmental data to understand population divergence. This study also allowed me to spend a significant amount of time in the field collecting tissue samples from snakes!

What difficulties did you run into along the way? 
One of the biggest difficulties with this study was handling and analyzing all the generated data. We had almost 400 samples sequenced for RADseq, so processing and analyzing these data took a significant amount of computational time. Another difficulty was the logistics of collecting fresh tissue samples for all of these species across the southwestern US and northern Mexico, but issues like this are easily overcome by collaborating.

What is the biggest or most surprising finding from this study? 
The biggest surprise from this study is that patterns of isolation-by-distance and isolation-by-environment are more important in explaining population genetic differentiation than a commonly cited biogeographic barrier. This result really stresses the importance of incorporating spatial analyses when analyzing phylogeographic data because aspatial analyses may result in spurious results of population structure and mislead our ideas of what is driving population divergence and speciation.

Moving forward, what are the next steps for this research? 
Moving forward I plan to generate whole genome sequence data for species within this system to understand what loci may be under selection in response to environmental heterogeneity. Given the strong signature of IBE I expect to find patterns of strong selection along transects of temperature and precipitation across the Sonoran and Chihuahuan Deserts. Further, I am interested in how other regions globally that have been cited as important biogeographic barriers in phylogeographic studies might also be strongly influenced by patterns of IBD and IBE, and not vicariant barriers.

What would your message be for students about to start their first research projects in this topic? 
There is so much great work published in the field of landscape genetics and comparative phylogeography and I would suggest that students start by combing through that work first. But as general advice I would suggest that students really explore their data in a meaningful way and spend some time thinking about what factors could be responsible for similar patterns observed in a genomic data set (e.g., IBD vs vicariance or selection vs historical demography).

What have you learned about science over the course of this project? 
I have really learned that genomic data should be carefully analyzed so as not to be influenced by preconceived ideas of the system that you might be working within. Also, I think that this is becoming more and more true, but you have to collaborate in order to do great science.

Describe the significance of this research for the general scientific community in one sentence.
This work demonstrates that codistributed species do not have shared evolutionary histories, and that they do not respond to the same landscape and shared environment in similar ways.

Describe the significance of this research for your scientific community in one sentence.
Our work shows that simple patterns of isolation-by-distance and isolation-by-environment have contributed to population genetic differentiation more so than commonly cited biogeographic barriers.

Full article: Myers EA, Xue AT, Gehara M, et al. Environmental heterogeneity and not vicariant biogeographic barriers generate community‐wide population structure in desert‐adapted snakes. Mol Ecol. 2019;28:4535–4548. https://doi.org/10.1111/mec.15182

Interview with the authors: quality and quantity of genetic relatedness data affect the analysis of social structure

Understanding the influence of relatedness on fine-scale social interactions within a population is fundamental to understanding the role of kinship in animal societies. In this study, Foroughirad et al. provide insight into the quality of Single Nucleotide Polymorphism (SNP) data required to obtain accurate and precise parentage assignments and relatedness coefficients using data from a long‐term behavioural study on bottlenose dolphins with a known partial pedigree. They then go on to explore how the quality of these estimates influences post-hoc analyses exploring the relationship between relatedness and social structure. Again, they provide important practical guidance about the quality of data needed for these types of analyses. This article was published in Molecular Ecology Resources: read the full article here, and read our interview with Vivienne Foroughirad, lead author of the study, below.

An adult female bottlenose dolphin with her six-month old calf. Photo Credit: Vivienne Foroughirad

What led to your interest in this topic / what was the motivation for this study? 

In the broadest sense my research interests concern the evolution of sociality and complex social behaviors such as cooperation. To that end, I was interested in our ability to parcel out contexts in which cooperation occurs between kin versus between unrelated individuals. Non-kin cooperation is rare in animal societies, and a common way to search for examples is to first investigate the link between the strength of social relationships and the genetic relatedness of pairs. Since genotyping-by-sequencing is now cheaper and more accessible than ever, I wanted to explore the effects this increased resolution would have on our power to test the relationship between social structure and relatedness, especially in viscous populations with strong philopatry.

What difficulties did you run into along the way? 

In our case, the greatest challenges centered around maintaining a longitudinal study on a wild marine mammal with a large enough sample size to make answering these types of questions feasible. We were lucky to have over 30 years of data available from the Shark Bay Dolphin Project which allowed us to verify some of the reconstructed pedigree relationships, as well as measure detailed home range usage and social associations. An analytical difficulty we encountered is how to account for the confounding effect of philopatry or limited dispersal on social relationships with kin if you want to distinguish kin discrimination from more passive kin associations that are a byproduct of shared space use. 

What is the biggest or most surprising finding from this study? 

We provided evidence that genotyping-by-sequencing methods could produce more precise relatedness values than typical microsatellite analyses, which isn’t surprising. What was less well-understood was the effect this would have for downstream analyses, such as those testing whether relatedness correlated with social affiliation. We found that even though our study species exhibits strong, life-long affiliative relationships between maternal kin, there were a surprising number of scenarios under which our analysis failed to detect a significant correlation between genetic relatedness and social associations. We also found surprisingly diminishing returns in relatedness resolution with increasing sample size (number of individuals) when small numbers of markers were used. 

Moving forward, what are the next steps for this research? 

Pedigree reconstruction is rapidly improving, especially where there is access to new genetic resources such as chromosome-level assemblies for non-model organisms. Improved kin assignment methods will allow us to investigate the function of these relationships at the level of the individual, which will help us to tease out how both intra- and inter-specific variation in ecology and demography affect social behavior. Within my own study site, I’m using these data to look at the effect of family size on social network position and reproductive success, as well as the demographic conditions that facilitate the formation of non-kin bonds. We’re also working on ways to better discriminate between maternal and paternal kin, which will be important for investigating the mechanisms of kin recognition.

What would your message be for students about to start their first research projects in this topic? 

That this is a great idea! Rapid advances in technology will open up new avenues of inquiry and there is lots of work to be done. Nevertheless, as with any field, you also need to know when to stop and submit. There will always be a new higher coverage genome or updated version of the software you’re using that’s about to be released, but if you keep reanalyzing your data with each advance, you’ll never finish a project. My second piece of advice would be to practice simulating data and analyzing it. Building simulated datasets, tweaking parameters, and testing different software has really deepened my understanding of methodologies. Plus, you can start before you even get your first sequencing results and be ready with a tested pipeline when you do get results in hand.
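For readers who want to try this, here is a minimal sketch of what simulating and analyzing relatedness data could look like. It is not the pipeline used in the study. It assumes biallelic, unlinked SNPs with known population allele frequencies, and it uses a simple moment-based estimator (summed genotype covariance divided by summed expected variance). All function names and parameter values are hypothetical and chosen for illustration only.

```python
import random
import statistics


def simulate_pair(freqs, related, rng):
    """Simulate genotypes (allele counts 0, 1 or 2) at each SNP for two individuals.

    related=True gives a parent-offspring pair; related=False gives two
    unrelated individuals. Loci are assumed independent (an illustrative
    simplification).
    """
    first, second = [], []
    for p in freqs:
        parent = (int(rng.random() < p), int(rng.random() < p))
        if related:
            # Offspring inherits one allele from the parent, one from the population.
            other = (rng.choice(parent), int(rng.random() < p))
        else:
            other = (int(rng.random() < p), int(rng.random() < p))
        first.append(sum(parent))
        second.append(sum(other))
    return first, second


def relatedness(a, b, freqs):
    """Moment estimator: about 0.5 for parent-offspring, about 0 for unrelated pairs."""
    num = sum((x - 2 * p) * (y - 2 * p) for x, y, p in zip(a, b, freqs))
    den = sum(2 * p * (1 - p) for p in freqs)
    return num / den


def estimate_spread(n_markers, n_pairs, related, seed=1):
    """Return the mean and standard deviation of estimates across simulated pairs."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
    estimates = []
    for _ in range(n_pairs):
        a, b = simulate_pair(freqs, related, rng)
        estimates.append(relatedness(a, b, freqs))
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # Compare how tightly estimates cluster as the number of markers grows.
    for n_markers in (50, 500, 2000):
        mean_r, sd_r = estimate_spread(n_markers, n_pairs=100, related=True)
        print(f"{n_markers} markers: mean r = {mean_r:.3f}, sd = {sd_r:.3f}")
```

Changing the number of markers, the number of pairs, or the allele frequency range in a sketch like this is one way to see how marker number affects the precision of relatedness estimates before any sequencing results arrive.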

What have you learned about science over the course of this project? 

That building a robust, reproducible, and well-documented pipeline for analysis is crucial. It might take a bit more work to set up, but it’s always worth it. I also benefitted a lot from the opportunity to present my work to audiences from different disciplines, which helped me keep the big picture in mind, since I’m the kind of person who gets easily caught up in minutiae. Biologically, I’m always reminded that there’s so much individual variation that gets masked by conducting analyses at the population level, and that rather than being discounted as noise, that variation could be leveraged to ask really interesting questions about how ecology and demography affect behavior.

Describe the significance of this research for the general scientific community in one sentence.

The correlation between genetic relatedness and the strength of social relationships can be masked by the limited power of typical published sample sizes.

Describe the significance of this research for your scientific community in one sentence.

We provide practical guidance for how sample sizes and sequencing methods might interact to improve precision of relatedness estimates and their effect on the analysis of social structure, using wild bottlenose dolphins as a case study.

Three juvenile male bottlenose dolphins surface synchronously. Photo Credit: Vivienne Foroughirad

Foroughirad, V., Levengood, A. L., Mann, J., & Frère, C. H. (2019). Quality and quantity of genetic relatedness data affect the analysis of social structure. Molecular Ecology Resources, 1181–1194. https://doi.org/10.1111/1755-0998.