Interview with the author: A guide to the application of Hill numbers to DNA based diversity analyses

Figure: Diversity assessment procedures in traditional and DNA sequencing‐based approaches. Recorded entities need to be classified into types, before each type is weighed according to its relative abundance and the order of diversity (q). Note that the example refers to an abundance‐based, rather than incidence‐based, approach.

What are Hill numbers? What do they have to do with estimating biodiversity? How can you use them as a molecular ecologist? Read the recent review in Molecular Ecology Resources by Antton Alberdi and Thomas Gilbert on this topic, and read the interview with Antton below to learn how they think about Hill numbers and their applications to metabarcoding. Also, check out hilldiv, “an R package to assist analysis of diversity for diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses based on Hill numbers, using OTU tables and associated phylogenetic trees as inputs. The package includes functions for (phylo)diversity measurement, (phylo)diversity profile plotting, (phylo)diversity comparison between samples and groups, (phylo)diversity partitioning and (dis)similarity measurement. All of these grounded in abundance-based and incidence-based Hill numbers.”

What led to your interest in this topic / what was the motivation for this study? 
Measuring, estimating and contrasting biological diversity are central operations in most ecological studies. In recent decades, dozens of diversity indices and metrics have been proposed, each with its own strengths and weaknesses, and specific mathematical assumptions. The measures that many of them yield are difficult to interpret, because the values might refer to abstract units, which lack a straightforward interpretation for non-specialists. We believe that the statistical framework developed around the Hill numbers overcomes many of these problems, and provides a statistical toolset that is extremely useful for ecologists. In addition, Hill numbers make it possible to incorporate complementary information, such as phylogenetic dissimilarities across organisms, which is really handy for molecular ecologists, who can easily build phylogenetic trees from metabarcoding data.

What difficulties did you run into along the way? 
We are a molecular ecologist and an evolutionary biologist who use many different mathematical tools, but are not expert mathematicians. Hence, one of the main challenges was to make sure that all the statements and mathematical interpretations were correct!

What is the biggest or most surprising innovation highlighted in this study?
The aim of our review was to demonstrate to ecologists, who like us might have a limited mathematical background, that implementing the framework developed around the Hill numbers is not difficult, and has big potential gains. In our review we gathered information and tools generated by others, mainly Lou Jost, Anne Chao and Chun-Huo Chiu, and presented them in a comprehensive way for molecular ecologists. We have tried to explain complex mathematical formulations in layman's terms, exactly as we would like others to explain to us topics we are not familiar with. We have provided examples and pieces of code that we hope will encourage other researchers to use these tools.

Moving forward, what are the next steps in this area of research?
Our article mainly focuses on diversity measurement from data generated using DNA metabarcoding. While bioinformatic methods to generate metabarcoding data have received much attention in the last decade, the impact of the statistical approaches used to analyse diversity has been less studied. Assessing their impact and providing guidelines for selecting the tool best suited to address specific questions with specific types of data will be an important next step in the area of metabarcoding-based diversity analyses.

What would your message be for students about to start developing or using novel techniques in Molecular Ecology? 
Although they might at first seem complex and abstract, bioinformatic and statistical tools are necessary to address ecological questions. Hence, we would encourage students to try to understand the basic bioinformatic and statistical procedures, so as to be able to select the best tools to address their research questions.

Figure: Differences between abundance‐based and incidence‐based Hill numbers. The Hill numbers yielded for the entire system are different depending on the approach employed. In abundance‐based approaches, the DNA sequence is the unit on which the diversity is computed, while in incidence‐based approaches, the sample is the unit upon which the diversity is measured. (*) The asterisk indicates that the equations are undefined for q = 1, thus in practice either the 1D formula shown in Table 1 or a value of q approaching unity must be used, for example, q = 0.9999. However, q = 1 is used for the sake of simplicity.
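For readers who want to see the arithmetic behind the caption above, here is a minimal Python sketch. It is not taken from the review or from hilldiv; the function names `hill_number` and `incidence_hill_number` and the example counts are hypothetical, chosen only for illustration. It assumes the standard definition of the Hill number of order q as (Σ pᵢ^q)^(1/(1−q)), with the q = 1 case handled as the exponential of the Shannon entropy, and it assumes that the incidence-based version simply replaces sequence counts with the number of samples in which each type was detected.

```python
import math


def hill_number(counts, q):
    """Hill number of order q from a list of counts (hypothetical helper).

    counts: non-negative counts per type (e.g. sequence reads per OTU).
    q: order of diversity; larger q gives more weight to abundant types.
    """
    total = sum(counts)
    # Relative abundances; types with zero count do not contribute.
    p = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1, so use its limit:
        # the exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def incidence_hill_number(presence_by_sample, q):
    """Incidence-based Hill number (assumed form: the sample is the unit).

    presence_by_sample: list of samples, each a list of 0/1 flags per type.
    """
    n_types = len(presence_by_sample[0])
    # Number of samples in which each type was detected.
    incidences = [
        sum(sample[i] for sample in presence_by_sample) for i in range(n_types)
    ]
    return hill_number(incidences, q)


if __name__ == "__main__":
    reads = [50, 30, 15, 5]  # made-up read counts for four types
    for q in (0, 0.9999, 1, 2):
        print(q, hill_number(reads, q))

    samples = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]  # made-up presence/absence data
    print(incidence_hill_number(samples, 1))
```

In this sketch, q = 0 returns the plain count of types, and the value at q = 0.9999 should sit very close to the value at q = 1, which is the practical workaround the caption mentions.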

What have you learned about methods and resources development over the course of this project?
That the most broadly employed tools are not always the best way to address scientific questions!

Describe the significance of this research for your scientific community in one sentence.
Hill numbers provide powerful, solid and versatile tools with which to carry out most of the analyses that are needed to assess biological diversity within a common statistical framework.
