A method to produce error-free long DNA sequence reads of many kilobases is reported this week in Nature Methods. The workflow uses Single Molecule, Real-Time (SMRT) sequencing technology and simplifies the assembly of bacterial genomes without a reference sequence. Complete genomes of this kind help researchers understand the roles of bacteria in ecology and pathology and trace their evolution.
Bacterial genomes are usually stitched together from short DNA sequences generated by next-generation sequencing machines, but this leaves gaps that are laborious to fill. Recent hybrid approaches combine different sequencing technologies: the gaps left when short reads from common next-generation sequencers are assembled are filled with information from longer, albeit less accurate, reads produced by SMRT sequencing. This strategy, however, requires multiple sequencing libraries and sequencing machines.
Long reads have long been sought for many sequencing applications, but their error rates have limited their utility. Jonas Korlach and colleagues’ strategy corrects errors in long reads using shorter reads from the same library, enabling very high-quality, gap-free assembly of bacterial genomes and bacterial artificial chromosomes in a fully automated pipeline. The approach also handles difficult repetitive sequences effectively.