In science news this month, a paper in BMC Biology (Nozaki et al.) reports "the first nuclear-genome sequence for any eukaryote that is 100% complete." This might come as a surprise to many scientists, even genomics experts.
The new genome is that of the red alga Cyanidioschyzon merolae, and it is just 16,546,747 nucleotides long, including all 20 chromosomes from telomere to telomere. The genome had been published previously, but it had 46 internal gaps (totaling 46,469 nt) and the telomeres were missing. The authors also discovered that about 20 kilobases had been mis-assembled previously, a common problem that I've written about elsewhere (see my editorial with Jim Yorke, "Beware of Misassembled Genomes", available on my home page).
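For readers who want to check numbers like "46 internal gaps" on a genome file themselves, here is a minimal sketch in Python. It assumes the assembly is a FASTA file and that gaps are written as runs of the letter N, which is a common convention but not something the paper confirms for this genome. The function names (parse_fasta, gap_report) are my own, made up for illustration.

```python
import re

def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}

def gap_report(seq):
    """Summarize runs of N in one sequence (assumption: N runs mark gaps)."""
    runs = [(m.start(), m.end()) for m in re.finditer(r"N+", seq)]
    # A run is "internal" if it touches neither end of the sequence.
    internal = [(s, e) for s, e in runs if s > 0 and e < len(seq)]
    return {
        "internal_gaps": len(internal),
        "internal_gap_bases": sum(e - s for s, e in internal),
        "starts_with_gap": seq.startswith("N"),
        "ends_with_gap": seq.endswith("N"),
    }

if __name__ == "__main__":
    # Toy data, not real sequence.
    toy = ">chr1 toy\nACGTNNNNACGT\nNNACGT\n>chr2 toy\nNNACGTACGT\n"
    for name, seq in parse_fasta(toy).items():
        print(name, len(seq), gap_report(seq))
```

Note that a file with zero N runs is not thereby complete: if a telomere is simply missing from the end of a chromosome, there is nothing in the sequence to mark its absence, so a check like this can only find gaps that the assemblers chose to record.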
But wait a minute, you might ask (as I did). What about the yeast genome (S. cerevisiae), originally published in 1996 as the first eukaryotic genome? I thought that was finished some time ago. It's true that there have been many published corrections since 1996, but I know the telomeres are present on most (all?) of the chromosomes. And how about the nematode C. elegans - it was published in 1998 while still incomplete, but about four years later it was announced as complete (see the link). These papers are cited by the new paper, but oddly, it doesn't explain what is missing from these earlier "complete" genomes. And I think we finally finished the malaria parasite, Plasmodium falciparum, although the original paper (which I was a part of) appeared in 2002, before all the gaps were closed.
Of course, most genomicists know that the human genome is still far from complete - all the telomeres and centromeres are missing, and there are several hundred other gaps - but I am a bit skeptical of the claim here that the red alga C. merolae is the first complete eukaryote. Can anyone out there tell me why I'm wrong?
It is nice that this paper brings up the issue of completeness for eukaryotic genomes (e.g., how many times have they announced the completion of the human genome?). I agree with you, though: the claims in this paper are not backed up by discussion or citation. Still, I am not so sure that they are wrong; maybe we can get some others to comment on this.
Well, I've just downloaded the latest version of the yeast genome, S. cerevisiae, and it appears that most but not all of the chromosomes have telomeric sequence. So I would have to say that the yeast genome is not 100% complete, despite what I (and most other people, probably) had thought.
I haven't checked the C. elegans claim, though - anyone want to look at that? They announced it was "complete" five years ago now. Plus there have been other small eukaryotes (not much bigger than bacteria) sequenced in the meantime - anyone know if any of those are complete?
Thanks for the news: the genomes I thought were complete are actually incomplete!
Not sure when http://research.medicine.wustl.edu/OCFR/Research.nsf/Abstracts/0761B7D48C45D98786256FA500717597?OpenDocument&VW=Cell+Biology+and+Regulation was published, but it seems like Prof. Richard K. Wilson is saying that C. elegans is incomplete too. Wikipedia says "the WS159 release of May 2006 added over 300 bp to the sequence [ http://www.wormbase.org/wiki/index.php/WS159 ]".
I wonder about yeast, too: http://www.yeastgenome.org/cache/genomeSnapshot.html#ChrSeqAnnotUpdates talks about categories such as "Dubious ORFs". And what are the "Total Length #" figures in http://www.yeastgenome.org/chromosomes.shtml ?
I think they mean the first fully autotrophic eukaryote.