RNA secondary structure prediction and an expert systems methodology for RNA comparative analysis in the genomic era
The ability of certain RNAs to fold into complex secondary and tertiary structures allows them to perform a variety of functions in the cell. Because these structures are central to understanding how the RNAs function, one of the most active areas of research has been how to accurately and reliably predict RNA secondary structure from sequence, a task better known as the RNA Folding Problem. This dissertation examines two fundamental areas of research in RNA structure prediction: free energy minimization and comparative analysis. The most popular RNA secondary structure prediction program, Mfold 3.1, predicts RNA secondary structure via free energy minimization using experimentally determined energy parameters. I present an evaluation of the accuracy of Mfold 3.1 using the largest available set of phylogenetically diverse, comparatively predicted RNA secondary structures. This evaluation shows that, despite significant revisions to the energy parameters, the prediction accuracy of Mfold 3.1 is not significantly improved over that of previous versions. In contrast, RNA comparative analysis has repeatedly demonstrated the ability to predict RNA secondary structure accurately and reliably. The downside is that RNA comparative analysis frequently requires an expert systems methodology that is predominantly manual in nature. As a result, RNA comparative analysis does not scale adequately to be useful in the genomic era. Therefore, I developed the Comparative Analysis Toolkit (CAT), which is intended to be the fundamental component of a vertically integrated software infrastructure that facilitates high-throughput RNA comparative analysis using an expert systems methodology.