The most complete and highest-quality genomes reported to date for 16 vertebrate species are presented in a paper in Nature this week. The study is part of a set of papers being published in Nature and Nature Communications from the Vertebrate Genomes Project, which aims to assemble high-quality genomes for all known living vertebrate species. These genomes could help to address fundamental questions in biology, medicine and biodiversity conservation.
Reference genomes offer insights into the functions of genomes and allow comparisons between species. However, gaps in our knowledge remain. First-generation sequencing and assembly techniques were costly, labour-intensive and slow. The second-generation short-read technologies that followed were cheaper and faster, but their shorter sequencing reads made it hard to piece the genome together correctly and so produced more fragmented assemblies. Technological developments, including but not limited to the availability of long-read sequencing technologies, have made it possible to overcome these issues. An overview paper in Nature describes the evaluation of approaches for assembling highly accurate and nearly complete reference genomes, and their application to a select group of species that represent various vertebrate orders.
Erich Jarvis and colleagues first evaluated multiple genome sequencing and assembly approaches in one species, the Anna’s hummingbird. They then applied the best-performing method to a further 15 species that represent the major vertebrate classes, including mammals (such as the platypus), birds (such as the zebra finch), reptiles, amphibians and fish. The optimized approach confirms that longer sequence reads maximize genome quality, and the resulting assemblies correct substantial errors seen in earlier reference genomes. The improved genomes reveal genes and even whole chromosomes that were missing from previous references. These findings offer new insights into genome evolution.
Moving forward, the Vertebrate Genomes Project approach described here will continue to be optimized. The ultimate goal is to produce at least one high-quality, near error-free and gapless reference genome for each of the 71,657 known living vertebrate species.
After the embargo ends, the overview paper will be available at: https://www.nature.com/articles/s41586-021-03451-0
The landing page for the collection will be: https://www.nature.com/articles/d42859-021-00001-6