PyroNoise, an algorithm that accurately determines microbial diversity in an ecological sample from high-throughput pyrosequencing data, is presented online in this week's Nature Methods.
During high-throughput pyrosequencing, each base in a sequence is read as a flash of light, and the intensities of these flashes are later converted into a sequence of the four bases. The technique has been a valuable tool for assessing microbial diversity, but the challenge is to distinguish sequencing errors from true diversity. Sequences classified as a new species may simply represent a known species with a few sequencing errors. This has led to overestimates of microbial diversity and to distorted phylogenetic trees, even when standard error-correction methods, such as the removal of ambiguous reads, were applied.
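To make the source of such errors concrete, the following is a minimal Python sketch of naive base calling, in which each light intensity is rounded to a whole number of identical bases. It is an illustration only and is not the PyroNoise algorithm. The flow order `TACG`, the function name `call_bases`, and the intensity values are assumptions chosen for the example.

```python
# Assumed cyclic order in which nucleotides are flowed over the sample.
FLOW_ORDER = "TACG"


def call_bases(flowgram, flow_order=FLOW_ORDER):
    """Convert light intensities to bases by rounding each intensity
    to the nearest whole number of identical bases (naive calling)."""
    bases = []
    for i, intensity in enumerate(flowgram):
        nucleotide = flow_order[i % len(flow_order)]
        # Round half up; negative noise is clamped to zero bases.
        run_length = max(0, int(intensity + 0.5))
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical intensities for the same underlying molecule.
clean = [1.02, 0.05, 2.04, 0.96]
noisy = [1.02, 0.05, 2.51, 0.96]  # third flash read slightly too bright

print(call_bases(clean))  # TCCG
print(call_bases(noisy))  # TCCCG
```

In this sketch, a small shift in a single intensity adds an extra base to the called sequence, which a downstream analysis could mistake for a distinct organism.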
With PyroNoise, Christopher Quince and colleagues introduce an algorithmic solution for error correction. PyroNoise goes back to the primary data output, the light intensities that represent each base, and calculates the correct sequence directly from these intensities. This largely removes errors from the data and allows an accurate calculation of diversity within a sample.