Browsing by Subject "RNA structure"

The Ascent of the Abundant: How Mutational Networks Constrain Evolution
(Public Library of Science, 2008-07-18) Cowperthwaite, Matthew C.; Economo, Evan P.; Harcombe, William R.; Miller, Eric L.; Meyers, Lauren Ancel
Evolution by natural selection is fundamentally shaped by the fitness landscapes in which it occurs. Yet fitness landscapes are vast and complex, and thus we know relatively little about the long-range constraints they impose on evolutionary dynamics. Here, we exhaustively survey the structural landscapes of RNA molecules of lengths 12 to 18 nucleotides, and develop a network model to describe the relationship between sequence and structure. We find that phenotype abundance—the number of genotypes producing a particular phenotype—varies in a predictable manner and critically influences evolutionary dynamics. A study of naturally occurring functional RNA molecules using a new structural statistic suggests that these molecules are biased toward abundant phenotypes. This supports an “ascent of the abundant” hypothesis, in which evolution yields abundant phenotypes even when they are not the most fit.
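As a rough illustration of phenotype abundance as the abstract above defines it (the number of genotypes producing a particular phenotype), the following Python sketch enumerates every sequence of a given length and counts how many map to each structure. The `toy_fold` function is a hypothetical stand-in that only pairs the sequence ends inward; it is not the folding algorithm used in the study, and all names here are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# Watson-Crick and G-U wobble pairs (standard RNA pairing rules).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_fold(seq, min_loop=3):
    """Hypothetical toy folder: pair the two ends inward while they are
    complementary, keeping at least `min_loop` unpaired bases in the loop.
    Returns a dot-bracket string. A real study would use a thermodynamic
    folding program instead."""
    structure = ["."] * len(seq)
    i, j = 0, len(seq) - 1
    while j - i > min_loop and (seq[i], seq[j]) in PAIRS:
        structure[i], structure[j] = "(", ")"
        i += 1
        j -= 1
    return "".join(structure)


def phenotype_abundance(length):
    """Exhaustively enumerate all 4**length sequences (genotypes) and count
    how many fold into each structure (phenotype)."""
    counts = Counter()
    for bases in product("ACGU", repeat=length):
        counts[toy_fold("".join(bases))] += 1
    return counts


if __name__ == "__main__":
    for structure, n in phenotype_abundance(6).most_common():
        print(structure, n)
```

Even with this toy rule, the counts are uneven across structures, which is the kind of variation in abundance the abstract refers to. Exhaustive enumeration grows as 4^L, so it is only practical for short sequences.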
Helix Capping in RNA Structure
(PLOS One, 2014-04-01) Lee, Jung C.; Gutell, Robin R.
Helices are an essential element in defining the three-dimensional architecture of structured RNAs. While internal basepairs in a canonical helix stack on both sides, the ends of the helix stack on only one side and are exposed to the loop side, thus susceptible to fraying unless they are protected. While coaxial stacking has long been known to stabilize helix ends by directly stacking two canonical helices coaxially, based on analysis of helix-loop junctions in RNA crystal structures, herein we describe helix capping, topological stacking of a helix end with a basepair or an unpaired nucleotide from the loop side, which in turn protects helix ends. Beyond the topological protection of helix ends against fraying, helix capping should confer greater stability onto the resulting composite helices. Our analysis also reveals that this general motif is associated with the formation of tertiary structure interactions. Greater knowledge about the dynamics at the helix-junctions in the secondary structure should enhance the prediction of RNA secondary structure with a richer set of energetic rules and help better understand the folding of a secondary structure into its three-dimensional structure. These together suggest that helix capping likely plays a fundamental role in driving RNA folding.
Improving the prediction of RNA secondary structure and automatic alignment of RNA sequences
(2012-05) Gardner, David Paul; Gutell, Robin; Ren, Pengyu; Browning, Karen; Russell, Rick; Makarov, Dmitrii E.; Miranker, Daniel
The accurate prediction of an RNA secondary structure from its sequence will enhance the experimental design and interpretation for the increasing number of scientists that study RNA. While the computer programs that make these predictions have improved, additional improvements are necessary, in particular for larger RNAs. The first major section of this dissertation is concerned with improving the prediction accuracy of RNA secondary structures by generating new energetic parameters and evaluating a new RNA folding model. Statistical potentials for hairpin and internal loops produce significantly higher prediction accuracy when compared with nine other folding programs. While more improvements can be made to the energetic parameters used by secondary structure folding programs, I believe that a new approach is also necessary. I describe an RNA folding model that is predicated on a large body of computational and experimental work. This model includes energetics, contact distance, competition and a folding pathway. Each component of this folding model is evaluated and substantiated for its validity. The statistical potentials were created with comparative analysis. Comparative analysis requires the creation of highly accurate multiple RNA sequence alignments. The second major section of this dissertation is focused on my template-based sequence aligner, CRWAlign. Multiple sequence aligners generally run into problems when the pairwise sequence identity drops too low. By utilizing multiple dimensions of data to establish a profile for each position in a template alignment, CRWAlign is able to align new sequences with high accuracy even for pairs of sequences with low identity.
Insights into RNA design from novel molecular tools
(2017-01-09) Vazquez Anderson, Alberto Jorge; Contreras, Lydia M.; Russell, Rick; Alper, Hal; Ren, Pengyu; Georgiou, George
RNA, previously recognized merely as a messenger of genetic information, has been recently rediscovered as a versatile molecule with a central role in cellular regulation. These regulatory functions are enabled by its specific chemical makeup that allows it to fold into intricate and flexible structures. In stark contrast with DNA, RNA forms a variety of structural motifs that serve as efficient points of contact in molecular recognition. It is therefore clear that dynamic RNA structures dictate the binding availability of interfaces that play important roles in molecular regulation inside living cells. As such, the need for tools that can accurately capture and predict RNA structure in vivo continues to be essential to understand RNA function. To this end, my dissertation focuses on the development of molecular tools to predict and characterize accessible RNA interfaces in their native environment. First, I established the usefulness of a fluorescence-based in vivo oligonucleotide hybridization approach to identify accessible interfaces by characterizing numerous RNA regions in several biologically relevant molecules in E. coli. I then described these RNA interactions using a biophysical model based on thermodynamic principles and incorporating large sets of data collected using this fluorescence-based system. This approach displayed improved prediction capabilities of RNA accessibility compared to un-optimized versions without incorporation of in vivo data. Finally, I detailed the development and application of a high throughput tool for the large-scale characterization of accessible interfaces within native RNAs in a single experiment. In this approach, in vivo oligonucleotide hybridization was coupled to transcriptional elongation control to allow analysis via next generation sequencing. This tool was used to obtain complete landscapes of functional structure for 72 regulatory molecules in a single experiment (>1000 regions).
Altogether, the results of this high throughput approach revealed a pattern indicating that RNA-RNA interaction sites are either highly accessible or highly protected, suggesting their binding status (e.g. actively bound or unbound). In addition, within bacterial small RNAs, our approach revealed the role of the global regulator Hfq as a universal structural relaxer. The compendium of these tools provides a unique and fundamental perspective in the study of functional RNA structure, namely, the identification of dynamic structures. Furthermore, the information provided by these approaches significantly aids in the design of synthetic RNAs for a variety of purposes, including gene expression control.
Matrix and tensor decomposition methods as tools to understanding sequence-structure relationships in sequence alignments
(2010-12) Muralidhara, Chaitanya; Alter, Orly, 1964-; Gutell, Robin
We describe the use of a tensor mode-1 higher-order singular value decomposition (HOSVD) in the analyses of alignments of 16S and 23S ribosomal RNA (rRNA) sequences, each encoded in a cuboid of frequencies of nucleotides across positions and organisms. This mode-1 HOSVD separates the data cuboids into combinations of patterns of nucleotide frequency variation across the positions and organisms, i.e., "eigenorganisms" and corresponding nucleotide-specific segments of "eigenpositions," respectively, independent of a-priori knowledge of the taxonomic groups and their relationships, or the rRNA structures. We show that this mode-1 HOSVD provides a mathematical framework for modeling the sequence alignments where the mathematical variables, i.e., the significant eigenpositions and eigenorganisms, are consistent with current biological understanding of the 16S and 23S rRNAs. First, the significant eigenpositions identify multiple relations of similarity and dissimilarity among the taxonomic groups, some known and some previously unknown. Second, the corresponding eigenorganisms identify positions of nucleotides exclusively conserved within the corresponding taxonomic groups, but not among them, that map out entire substructures inserted or deleted within one taxonomic group relative to another. These positions are also enriched in adenosines that are unpaired in the rRNA secondary structure, the majority of which participate in tertiary structure interactions, and some also map to the same substructures. This demonstrates that an organism's evolutionary pathway is correlated and possibly also causally coordinated with insertions or deletions of entire rRNA substructures and unpaired adenosines, i.e., structural motifs which are involved in rRNA folding and function.
Third, this mode-1 HOSVD reveals two previously unknown subgenic relationships of convergence and divergence between the Archaea and Microsporidia, that might correspond to two evolutionary pathways, in both the 16S and 23S rRNA alignments. This demonstrates that even on the level of a single rRNA molecule, an organism's evolutionary pathway is composed of different types of changes in structure in reaction to multiple concurrent evolutionary forces.
Mobile group II intron : host factors, directed evolution, and gene targeting in human cells
(2014-05) Truong, David Minh; Lambowitz, Alan
Mobile group II introns are retroelements that are found in prokaryotes, archaea, and the organelles of plants and fungi, but not in the nuclear genomes of eukaryotes. They consist of a catalytically active RNA and intron-encoded reverse transcriptase, which together promote site-specific integration into DNA sites in a mechanism called retrohoming. The group II intron Ll.LtrB has been developed into a programmable, DNA-targeting agent called "targetron", which is widely used in bacteria and an attractive technology for gene targeting in eukaryotes. However, group II intron genome targeting in human cells has not been unequivocally shown. This dissertation focuses on the hypothesis that the low Mg2+-concentrations found in higher eukaryotes present a natural barrier to group II introns. First, I studied E. coli host proteins that aid group II intron retrohoming and found that synthesis of a second DNA-strand relies on host replication restart proteins. Next, I demonstrated that mutations in the distal stem of the catalytic core domain V (DV) improve Ll.LtrB retrohoming in a low Mg2+-concentration E. coli mutant and in biochemical assays. These results suggest that DV is involved in an RNA-folding step that becomes rate limiting at low Mg2+. Subsequently, I performed directed evolution of the intron RNA by injecting in vitro prepared mutant intron libraries into Xenopus laevis oocyte nuclei. The mutations were analyzed using Roche 454 sequencing to generate an intron fitness landscape, which revealed conserved positions and potentially beneficial mutations, enabling enhanced retrohoming in Xenopus oocytes. Finally, I used a hybrid Pol II/T7 Ll.LtrB eukaryotic expression system to show that high exogenous MgCl2 in the growth media enables retrohoming into plasmids and genomic DNA in human cells. In vivo directed evolution and mutation analyses using PacBio RS circular consensus sequencing indicated that only a few mutations may improve intron activity in human cells.
This dissertation provides evidence that efficient group II intron retrohoming in human cells is limited by low Mg2+-concentrations and develops new approaches for overcoming this limitation to enable use of group II introns for gene targeting in higher organisms.
Modeling RNA, protein, and synthetic molecules using coarse-grained and all-atom representations
(2016-12-05) Bell, David Russell; Ren, Pengyu; Elber, Ron; Stachowiak, Jeanne; Behar, Marcelo
The aim of computational chemistry is to depict and understand the dynamics and interactions of molecular systems. In addition to increased comprehension in the physical and life sciences, this insight yields important applications to therapeutic design and materials science. In computational chemistry, molecules can be modeled in a number of representations depending on the molecular system and phenomena of interest. In this work, both simplified, coarse-grained representations and all-atom representations are used to model the interactions of RNA, cucurbituril host-guest chemistry, and cadmium selenide quantum dot binding to the Src homology 3 domain. For RNA, a coarse-grained model was developed termed RACER (RnA CoarsE-gRained) to accurately predict RNA structure and folding free energy. After optimization to statistical potentials, RACER accurately predicted the structures of 14 RNAs with an average 4.15 Å root mean square deviation (RMSD) to the experimental structure. Further, RACER captured the sequence-specific variation in folding free energy for a set of 6 RNA hairpins and 5 RNA duplexes, with an R² correlation of 0.96 to experiment. The binding free energies of a cucurbituril host with 14 guests were computed using a polarizable force field and the free energy techniques of Bennett acceptance ratio and the orthogonal space random walk. The polarizable force field captured binding accurately, yet unexpectedly, the orthogonal space random walk method converged slowly, albeit at still reduced computational expense to the Bennett acceptance ratio. Lastly, the nanotoxicity effects of trioctylphosphine oxide coated cadmium selenide quantum dots are investigated with the model Src homology 3 protein domain in complex with its native proline rich motif ligand. With increasing quantum dot concentration, there is an increasing preference for the quantum dots to bind to the proline rich motif active site, inhibiting Src homology 3 function.
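For readers unfamiliar with the RMSD figure quoted in the abstract above, the following Python sketch shows how a root mean square deviation between a predicted and an experimental structure can be computed. It assumes the two coordinate sets are already superimposed and list the same atoms (or coarse-grained beads) in the same order; the function name and the sample coordinates are illustrative and do not come from the dissertation.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of
    (x, y, z) coordinates, in the same units as the input (e.g. angstroms).
    Assumes the structures are already optimally superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between corresponding positions.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    experimental = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(predicted, experimental))
```

In practice a superposition step (for example the Kabsch algorithm) is normally applied before this calculation; that step is omitted here to keep the sketch short.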
Probing stability, specificity, and modular structure in group I intron RNAs
(2010-12) Wan, Yaqi; Russell, Rick, 1969-; Browning, Karen S.; Gutell, Robin; Hoffman, David; Ren, Pengyu
Many functional RNAs are required to fold into specific three-dimensional structures. A fundamental property of RNA is that its secondary structure and even some tertiary contacts are highly stable, which gives rise to independent modular RNA motifs and makes RNAs prone to adopting misfolded intermediates. Consequently, in addition to stabilizing the native structure relative to the unfolded species (defined here as stability), RNAs are faced with the challenge of stabilizing the native structure relative to alternative structures (defined as structural specificity). How RNAs have evolved to overcome these challenges is incompletely understood. Self-splicing group I introns have been used to study RNA structure and folding for decades. Among them, the Tetrahymena intron was the first discovered and has been studied extensively. In this work, we found that a version of the intron that was generated by in vitro selection for enhanced stability also displayed enhanced specificity against a stable misfolded structure that is globally similar to the native state, despite the absence of selective pressure to increase the energy gap between these structures. Further dissection suggests that the increased specificity against misfolding arises from two point mutations, which strengthen a local tertiary contact network that apparently cannot form in the misfolded conformation. Our results suggest that the structural rigidity and intricate networks of contacts inherent to structured RNAs can allow them to evolve exquisite structural specificity without explicit negative selection, even against closely-related alternative structures. To explore further how RNAs gain stability from intricate architectures, we examined a novel group I intron from red algae (Bangia). 
Biochemical methods and computational modeling suggest that this intron possesses general motifs of group IC1 introns but also forms an atypical tertiary contact, which has been reported previously in other subgroups and helps position the reactive helix at the active site. In the Bangia intron, the partners have been swapped relative to known group I RNAs that include this contact. This result underscores the modular nature of RNA motifs and provides insight into how structured RNAs can arrange helices and contacts in multiple ways to achieve and stabilize functional structures.
Quantitative dissection of RNA structure formation reveals a cooperative and modular folding and assembly landscape
(2017-06-19) Gracia, Brant Richard; Russell, Rick, 1969-; Johnson, Kenneth A; Finkelstein, Ilya; Matouschek, Andreas; Contreras, Lydia
Structured RNAs are pervasive in biology with ubiquitous roles in gene expression and regulation. RNAs must fold from a linear chain of nucleotide sequence to attain three-dimensional structure. RNA folding can be described as modular and hierarchical with tiers of structure that form independently: secondary structure forms first and defines the helices followed by formation of tertiary structure. The separation between secondary and tertiary structure is not absolute. Many biological RNAs couple secondary structure changes to RNA tertiary structure formation and link these changes to downstream functional consequences. To predict how these biological RNAs fold requires a deep understanding of the structural intermediates, folding pathways, and mechanisms of cooperativity that promote folding. To test the modularity and predictability of secondary and tertiary RNA folding and assembly, we have investigated the folding and assembly of the P5abc subdomain from the Tetrahymena thermophila Group I intron ribozyme. P5abc folds cooperatively in isolation, binding Mg²⁺ ions and adopting tertiary structure. Mg²⁺ binding is linked to a shift in the secondary structure of seventeen nucleotides and prior work concluded that there is a high degree of cooperativity for this seemingly concerted transition. With the already established principles of RNA modularity in mind, we develop a reconstitution hypothesis to test if cooperative secondary and tertiary folding and assembly of P5abc can be understood from the component pieces. By using rational mutagenesis, we find that higher order folding of P5abc is modular, and we elucidate the physical origins of cooperativity (Chapter 2). With our knowledge of isolated P5abc folding, we demonstrate that the local folding transition within P5abc controls the rate and pathway of assembly with the P5abc-deleted ribozyme core (E^ΔP5abc), further highlighting the modularity of RNA structure (Chapter 3).
Lastly, we show that the kinetics of assembly can be attributed to specific tertiary contacts that form in the assembly transition state such that the rate of a particular folding pathway is dictated by the properties of an individual tertiary contact (Chapter 4). The modularity of RNA structure makes it a reasonable molecule for the origins of life and an adaptable tool for bioengineering applications.
Structural Constraints Identified with Covariation Analysis in Ribosomal RNA
(Public Library of Science, 2012-06-19) Shang, Lei; Xu, Weijia; Ozer, Stuart; Gutell, Robin R.
"Covariation analysis is used to identify those positions with similar patterns of sequence variation in an alignment of RNA sequences. These constraints on the evolution of two positions are usually associated with a base pair in a helix. While mutual information (MI) has been used to accurately predict an RNA secondary structure and a few of its tertiary interactions, early studies revealed that phylogenetic event counting methods are more sensitive and provide extra confidence in the prediction of base pairs. We developed a novel and powerful phylogenetic events counting method (PEC) for quantifying positional covariation with the Gutell lab’s new RNA Comparative Analysis Database (rCAD). The PEC and MI-based methods each identify unique base pairs, and jointly identify many other base pairs. In total, both methods in combination with an N-best and helix-extension strategy identify the maximal number of base pairs. While covariation methods have effectively and accurately predicted RNAs secondary structure, only a few tertiary structure base pairs have been identified. Analysis presented herein and at the Gutell lab’s Comparative RNA Web (CRW) Site reveal that the majority of these latter base pairs do not covary with one another. However, covariation analysis does reveal a weaker although significant covariation between sets of nucleotides that are in proximity in the three-dimensional RNA structure. This reveals that covariation analysis identifies other types of structural constraints beyond the two nucleotides that form a base pair.