Genome Majority Vote Improves Gene Predictions
Recent studies have noted extensive inconsistencies in gene start sites among orthologous genes in related microbial genomes. Here we provide the first documented evidence that imposing gene start consistency improves the accuracy of gene start-site prediction. We applied an algorithm using a genome majority vote (GMV) scheme to increase the consistency of gene starts among orthologs. We used a set of validated Escherichia coli genes as a standard to quantify accuracy. Results showed that the GMV algorithm can correct hundreds of gene prediction errors in sets of five or ten genomes while introducing few errors. Using a conservative calculation, we project that GMV would resolve many inconsistencies and errors in publicly available microbial gene maps. Our simple and logical solution provides a notable advance toward accurate gene maps.
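The abstract does not spell out how the vote is carried out, so the following is only a minimal sketch of the general idea of a majority vote over ortholog start sites, not the authors' implementation. It assumes that each gene's predicted start has already been mapped to a column of a multiple alignment of its ortholog family, that a strict majority is required, and that a gene is moved only if a start codon is present at the majority column. The names `majority_start`, `apply_vote`, `starts`, and `codon_at` are hypothetical.

```python
from collections import Counter

# Common bacterial start codons (an assumption of this sketch).
START_CODONS = {"ATG", "GTG", "TTG"}


def majority_start(starts):
    """Return the alignment column backed by a strict majority, or None.

    starts maps a genome name to the alignment column of its predicted start.
    """
    counts = Counter(starts.values())
    column, votes = counts.most_common(1)[0]
    return column if votes > len(starts) / 2 else None


def apply_vote(starts, codon_at):
    """Move minority genes to the majority column when that is plausible.

    codon_at maps (genome, column) to the codon found at that column.
    A gene is revised only if a start codon exists at the majority column;
    otherwise its original prediction is kept.
    """
    column = majority_start(starts)
    if column is None:
        return dict(starts)  # no strict majority: leave predictions untouched
    revised = {}
    for genome, start in starts.items():
        candidate = codon_at.get((genome, column))
        if start != column and candidate in START_CODONS:
            revised[genome] = column
        else:
            revised[genome] = start
    return revised


# Toy family of five orthologs: three agree, two disagree.
starts = {"A": 12, "B": 12, "C": 12, "D": 30, "E": 0}
codon_at = {("D", 12): "ATG", ("E", 12): "CCC"}
revised = apply_vote(starts, codon_at)
```

In this toy family, genome D is moved to the majority column because a start codon is available there, while genome E keeps its original start because it has no start codon at that column. A real pipeline would also need ortholog detection and alignment, which are outside the scope of this sketch.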
Michael E. Wall, Judith D. Cohn, and John Dunbar are with Los Alamos National Laboratory. Sindhu Raghavan is with UT Austin and Los Alamos National Laboratory.
Citation: Wall ME, Raghavan S, Cohn JD, Dunbar J (2011) Genome Majority Vote Improves Gene Predictions. PLoS Comput Biol 7(11): e1002284. doi:10.1371/journal.pcbi.1002284