Investigations in integrative and molecular bioscience
Abstract
Modern biology is undergoing a revolution in methods and insights driven by the availability of high-throughput DNA sequencing technology. Here I present work contributing mathematical and computational methods for gaining insight from large DNA sequencing data sets at three distinct levels. First, I present a method for improving the accuracy and efficiency of DNA barcodes, short sequences of DNA used to label individual molecules in pooled samples. Many DNA sequencing applications depend on DNA barcodes. However, errors in DNA synthesis and sequencing (substitutions, insertions, and deletions) confound the correct interpretation of these barcodes. I present Filled/truncated Right End Edit (FREE) barcodes, designed for barcode error correction in the context of a downstream sequence. Second, I present the Chip-Hybridized Affinity Mapping Platform (CHAMP), a novel technology for repurposing used DNA sequencing chips to study the mechanism and sequence preferences of DNA-binding proteins. Since 2012, the CRISPR family of proteins has gained wide application for its efficiency and ease of use in editing genomes in vivo. Using CHAMP, in collaboration with experimentalists in Ilya Finkelstein's lab, I investigated the mechanism and sequence preference of the CRISPR Cascade complex and discovered a novel periodic lack of sequence specificity in DNA binding. I further determined specific nucleotides important for recruitment of and processing by the nuclease domain, Cas3. Third, I present a meta-analysis of Chiroptera, the order of bats, using the new wealth of DNA sequence information from eighteen bat species. The transcriptome sequencing data for two of these bats (Hypsignathus monstrosus and Rousettus aegyptiacus, associated with studies of the Ebola and Marburg viruses, respectively) are novel to this study. Using all of this DNA sequence information, I reconstructed a high-confidence Chiropteran phylogeny and found 299 genes with signatures of positive selection, a signature associated with viral antagonism. Further study of these genes may shed light on the mechanisms through which several bat viruses relevant to human health, including SARS, Ebola, Hendra, and Nipah, hijack the cell.