In a recent paper in Molecular Ecology, Caballero-López and colleagues investigated the genetics of migratory behaviour in two subspecies of willow warbler (Phylloscopus trochilus trochilus and Phylloscopus trochilus acredula). Previous work had identified several genetic markers associated with migratory behaviour in this species, but one particularly important candidate marker could not be mapped to previous genome assemblies. This suggested to Caballero-López et al. that the marker may lie in a highly repetitive, and thus difficult to assemble, genomic region. Leveraging a recent genome assembly based on long-read technology and a quantitative PCR approach, Caballero-López et al. found that the elusive migration marker is located in a genomic region rich in remnants of transposable elements.
We sent some questions to the primary author of this work, Violeta Caballero-López, to get some more insight and details about this exciting study.
What led to your interest in this topic / what was the motivation for this study?
My research aims to shed light on the genetics underpinning bird migration, which is currently very poorly understood. Passerine birds migrate alone, and they follow the same routes to wintering grounds as their parents, fully relying on genetic mechanisms.
The motivation for this specific study was to try to characterize a region of the genome that varies between two subspecies of willow warbler that differ in their migration to Africa. Until now, this region had been identified only as an AFLP-derived marker that could not be mapped to the genome. However, by combining molecular techniques such as qPCR with a good-quality genome assembly, we could better understand the nature of this element.
AFLP: Amplified Fragment Length Polymorphism
qPCR: Quantitative Polymerase Chain Reaction
Can you describe the significance of this research for the general scientific community in one sentence?
Repeat-rich regions, which are often considered “junk DNA”, might play a larger role in phenotypes and function than previously thought.
Can you describe the significance of this research for your scientific community in one sentence?
It is important to revisit the role of repeat DNA in determining a complex trait such as bird migratory routes.
What difficulties did you run into along the way?
For more than 20 years, “WW2” has been an elusive AFLP marker, observed to be fixed in the “northern” subspecies P. t. acredula. It could only be amplified by PCR as a 154 bp fragment and then sequenced, but its nature was totally unknown. Identifying and curating this sequence as a transposable element (TE) was challenging because it is an old, degraded “LTR portion” of the full element. This required a willow warbler genome built with long-read sequencing techniques, which resolved regions of the genome rich in repeat DNA. Locating the ends of this TE was also complicated. Alignment “breaks” normally serve to detect the target site duplications that mark the edges of these elements. However, they could not be used in our system because these TEs appear consistently embedded within a larger block of repeats. This interfered with our estimation of the age of the repeat and our theories about its origin.
LTR: Long terminal repeat
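To illustrate the idea of target site duplications (TSDs) mentioned above, here is a minimal Python sketch. It is not the method used in the paper: the function name `find_tsd`, the exact-match rule, and the 4 to 6 bp length window are all assumptions made for illustration. It simply checks whether the bases immediately upstream of a candidate element match the bases immediately downstream of it.

```python
def find_tsd(sequence, start, end, min_len=4, max_len=6):
    """Return the longest exact direct repeat flanking sequence[start:end].

    Hypothetical helper for illustration only. `start` and `end` are the
    0-based, end-exclusive coordinates of a candidate element. The default
    length window (4 to 6 bp) is an assumption, and real TSDs may carry
    mismatches that this exact comparison would miss.
    """
    for k in range(max_len, min_len - 1, -1):
        # Skip lengths that would run off either end of the sequence.
        if start - k < 0 or end + k > len(sequence):
            continue
        upstream = sequence[start - k:start]
        downstream = sequence[end:end + k]
        if upstream == downstream:
            return upstream
    return None


# Toy example: a made-up element flanked by the 5 bp repeat "ATGAC".
element = "TGTTAACCGGTTAACA"
toy = "GGCA" + "ATGAC" + element + "ATGAC" + "TTCG"
tsd = find_tsd(toy, start=9, end=9 + len(element))

# Toy example without matching flanks: no TSD is reported.
no_tsd = find_tsd("CCCCCC" + element + "GGGGGG", start=6, end=6 + len(element))
```

A check like this depends on knowing where the element begins and ends. When the element sits inside a larger block of repeats, as described in the answer above, those boundaries are ambiguous and the flanking bases are themselves repetitive, so a simple comparison of this kind would not be informative.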
What is the biggest or most surprising innovation highlighted in this study?
The most surprising finding here is the presence of a large repeat-rich region (>12 Mb) that segregates in both willow warbler subspecies. This region is characterized by several copies of the WW2 derived variant, which turned out to be part of a transposable element belonging to the endogenous retrovirus family. Furthermore, we provide solid evidence of its independence from the other polymorphic regions on chromosomes 1 and 5. As this TE seems to be inactive, and no clear functional genes have been detected in its surroundings, it remains puzzling why this region correlates so strongly with migration in the willow warbler.
You end your paper by noting that it is premature to conclude that the WW2 derived variant has a causal role in the trait. Based on your knowledge of the warbler genome, would you care to speculate as to the actual causal basis of the phenotype?
The most supported hypothesis is that migration is a complex trait influenced by gene packages. In the case of the willow warblers, I would speculate that the repeat-rich region, and not necessarily the WW2 derived variant itself, could affect migration indirectly through 1) the formation of a structural variant in a chromosome that affects gene expression; 2) the trans-regulation from this region of some gene(s) elsewhere in the genome; 3) the presence of an adjacent gene outside this region that we have not been able to detect in the current genome assemblies so far; or 4) a missed single-copy gene within the repeat-rich region. However, the last one is the least likely, given that areas with such a repeat density rarely contain functional genes.
Have you got any ideas of how you might test the hypothesis that chromosomal rearrangements were facilitated by the presence of TEs?
The most exciting possibility is to visually confirm whether these rearrangements have taken place. A way to test this empirically would be to obtain a karyotype of each subspecies and combine it with fluorescence in situ hybridization (FISH). First, a probe labelling the WW2 derived variants would signal the location of the repeat-rich region. Once the location of this region is resolved, it is possible to design several fluorescent probes outside of it to determine whether the chromosomal arrangement around it is the same in the genome of P. t. acredula and in the orthologous region of P. t. trochilus.
Moving forward, what are the next steps in this area of research?
The biggest mystery within this study is the location in the genome of this repeat-rich region that contains several copies of the WW2 derived variant. One of the biggest challenges of genome assembly is the mapping and correct placement of repeat-dense sequences, and therefore future efforts should focus on obtaining empirical evidence of the location of this region. Then we could get a better idea of whether, and how, this region affects migration. Is it downstream or upstream of any gene complex? Is it silenced? How does its orthologue look in P. t. trochilus?
Caballero-López, V., Lundberg, M., Sokolovskis, K., & Bensch, S. (2022). Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers (Phylloscopus trochilus). Molecular Ecology, 31, 1128–1141.