Why Do Only Transcriptomes?

Sequencing transcripts (i.e. expressed genes) is inherently cheaper than sequencing genomes, because it obviates the need to sequence the intronic and intergenic regions, which can be orders of magnitude larger. Of course, one can never recover every gene from transcripts alone, and it is not our intention to argue for one approach over the other, since (a) the pros and cons were laid out two decades ago for the Human Genome Project, and (b) BGI-Shenzhen is also sequencing genomes de novo, such as that of the giant panda, with the new technology. The point is that once you get away from the few dozen obviously important plant species, almost none of the roughly half million plant species known to humanity has been touched by genomics at any level.

We are, however, emphatically not generating ESTs (i.e. expressed sequence tags), which commit only a single read to each transcript. We are making shotgun libraries and trying to reconstruct full-length transcripts by computationally assembling the fragments. Based on preliminary experiments with 1 Gb of raw sequence per species, we expect the first few thousand of the most abundantly expressed genes to be full-length. For the next ten thousand genes, we still recovered more of the coding region than traditional ESTs, which prime off the poly-A tail and capture mostly the 3'-UTR. That said, thanks to continuing technology improvements, we expect that most of the project will generate 2 Gb of raw sequence per species.