The largest set of human genomes sequenced from a single population is reported in four linked papers published online this week in Nature Genetics. These data provide a comprehensive resource for the study of human disease and genetic diversity.
Kari Stefansson and colleagues sequenced the genomes of 2,636 Icelanders. They identified more than 20 million genetic variants that can be used, together with national healthcare information and extensive genealogical records, to better understand the genetic basis of many diseases. To demonstrate the usefulness of this resource, the authors combined the whole-genome sequence data with less extensive genotype data from over 104,000 additional Icelanders to strengthen their association tests, and identified a number of variants associated with a variety of diseases. In one such test of the power of these data, the researchers identified a variant of the gene ABCB4 that is significantly associated with the risk of developing liver disease.
In a separate study using these data, Stefansson and colleagues identified mutations in the ABCA7 gene that are associated with an increased risk of Alzheimer’s disease. Six of the eight mutations in ABCA7 were also found in other populations of European ancestry, including in the United States, indicating that the results are not specific to the Icelandic population.
In a third study, Patrick Sulem and colleagues mined these data to identify over 8,000 Icelandic individuals who have completely lost the function of at least one gene. In all, they identified 1,171 of these rare genetic “knockouts”. Olfactory receptor genes, which are responsible for our ability to discriminate between different smells, were the most commonly knocked-out class of genes, while genes expressed highly in the brain were rarely knocked out, suggesting that their loss would be more harmful. Further mining of these data will help scientists understand which genes are indispensable and which are linked to disease.
The fourth linked study, by Agnar Helgason and colleagues, used whole-genome sequence data from 753 Icelandic men from 274 groups of related individuals to estimate the rate of mutations on the Y chromosome. The results represent the most accurate estimate of the male-specific mutation rate to date. The study dates the most recent common ancestor (MRCA) for human Y chromosomes to 174,000-321,000 years ago. This estimate is much closer than earlier ones to that of the MRCA for mitochondrial DNA, which is inherited exclusively from the mother, and may have implications for understanding the evolution of our species.
Genetic and genomic variation data pertaining to the paper “Sequence variants from whole genome sequencing a large group of Icelanders” are described in an accompanying Scientific Data article, along with important methodological details and other information to facilitate the use of these data by other researchers.