Different computational tools for analysing sequencing data from a cell’s entire repertoire of RNA are compared in two related studies published online this week in Nature Methods. Although the DNA in every cell of an organism is essentially the same, the parts of the genome that are transcribed into RNA differ between cell types. This diversity is largely responsible for cells' different functions and is therefore important to understand.
High-throughput sequencing of all of a cell’s RNA, known as its transcriptome, using a method called RNA-seq has been instrumental in understanding the role of many genes. However, reconstructing entire transcripts, often many kilobases in length, from short 75-base-pair sequence reads is a challenge that requires advanced computational tools.
A consortium led by Paul Bertone and colleagues compared more than twenty state-of-the-art computational methods for transcriptome analysis, assessing their ability to carry out two important steps in RNA-seq analysis. The first paper investigates which tools best map stretches of sequence to a reference genome, and the second looks at tools that reconstruct transcripts from those mapped sequences. Both papers highlight the strengths of current computational tools as well as their weaknesses and areas in need of improvement. Most of the transcript reconstruction tools, for example, performed well in reassembling parts of a transcript, but none could accurately recapitulate the entire RNA. The authors provide metrics that will be helpful for benchmarking future tools.