Genome-wide approaches to explore transcriptional regulation in eukaryotes

Park, Daechan

Date

2014-05

Authors

Park, Daechan

Abstract

Transcriptional regulation is a complex process controlled by numerous factors, including transcription factors (TFs), chromatin remodeling enzymes, nucleosomes, post-transcriptional machinery, and cis-acting DNA sequences. I explored transcriptional regulation in eukaryotes through three distinct studies to build a comprehensive understanding of functional genomics at various steps of the process.

Although a variety of high-throughput approaches have been developed to study this system on a genome-wide scale with high resolution, the lack of accurate and comprehensive annotation of transcription start sites (TSS) and polyadenylation sites (PAS) has hindered precise analyses even in Saccharomyces cerevisiae, one of the simplest eukaryotes. We developed Simultaneous Mapping Of RNA Ends by sequencing (SMORE-seq) and identified the strongest TSS and PAS of over 90% of yeast genes at single-nucleotide resolution. Owing to the high accuracy of the TSS identified by SMORE-seq, we detected 150 possibly mis-annotated genes that have a TSS downstream of the annotated start codon. Furthermore, SMORE-seq showed that 5’-capped non-coding RNAs were highly transcribed divergently from TATA-less promoters in wild-type cells under normal conditions.

Mapping DNA-protein interactions is essential to understanding the role of TFs in transcriptional regulation, and ChIP-seq is the most widely used method for this purpose. However, little attention has been given to the technical bias that the many experimental steps of ChIP-seq, including fixation and shearing of chromatin, immunoprecipitation, sequencing library construction, and computational analysis, can introduce into final target calling. While analyzing large-scale ChIP-seq data, we observed that unrelated proteins appeared to bind to the gene bodies of highly transcribed genes across datasets. Control experiments, including input, IgG ChIP in untagged cells, and ChIP of the Golgi factor Mnn10, also showed strong signal at the same loci, indicating that the signals were derived from technical bias rather than biologically meaningful binding. In addition, the appearance of nucleosomal periodicity in ChIP-seq data for proteins localizing to gene bodies is another bias that can be mistaken for genuine interactions with nucleosomes. We alleviated these biases by correcting the data with proper negative controls, but the biases could not be completely removed. Therefore, caution is warranted in interpreting ChIP-seq results.

Nucleosome positioning is another critical mechanism of transcriptional regulation. Global mapping of nucleosome occupancy in S. cerevisiae strains deleted for chromatin remodeling complexes has elucidated the role of these complexes on a genome-wide scale. In this study, loss of chromodomain helicase DNA binding protein 1 (Chd1) resulted in severe disorganization of nucleosome positioning. Despite the difficulty of performing ChIP-seq for chromatin remodeling complexes, which localize to chromatin transiently and dynamically, we successfully mapped the genome-wide occupancy of Chd1 and quantitatively showed that Chd1 co-localizes with early transcription elongation factors, but not with late transcription elongation factors. Interestingly, Chd1 occupancy was independent of the methylation levels at H3K36, indicating the need for a new working model describing Chd1 localization.
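The idea of correcting ChIP-seq signal with a negative control, mentioned in the abstract, can be illustrated with a minimal sketch. This is not the dissertation's actual procedure, which the abstract does not describe. The function name `control_corrected_enrichment`, the use of fixed genomic bins, the total-depth scaling, and the pseudocount are all assumptions made for illustration.

```python
import math


def control_corrected_enrichment(chip_counts, control_counts, pseudocount=1.0):
    """Return per-bin log2(ChIP / depth-scaled control).

    Hypothetical helper: both inputs are read counts over the same
    genomic bins. The control (for example input DNA or an untagged
    ChIP) is scaled to the ChIP sequencing depth before the ratio is
    taken, and a pseudocount avoids division by zero.
    """
    if len(chip_counts) != len(control_counts):
        raise ValueError("ChIP and control must cover the same bins")
    chip_total = sum(chip_counts)
    control_total = sum(control_counts)
    if chip_total == 0 or control_total == 0:
        raise ValueError("both samples need at least one read")

    # Scale the control so that both samples have the same total depth.
    scale = chip_total / control_total
    return [
        math.log2((chip + pseudocount) / (control * scale + pseudocount))
        for chip, control in zip(chip_counts, control_counts)
    ]


if __name__ == "__main__":
    # Made-up counts: bin 1 mimics a highly transcribed gene body that is
    # high in both ChIP and control; bin 3 is high only in the ChIP.
    chip = [5, 100, 5, 50]
    control = [5, 100, 5, 5]
    for i, value in enumerate(control_corrected_enrichment(chip, control)):
        print(f"bin {i}: log2 enrichment = {value:.2f}")
```

In this toy input, the bin that is high in both samples ends up with a lower score than the bin that is high only in the ChIP, which is the intended effect of the correction. As the abstract notes, such a correction reduces the bias but should not be expected to remove it completely.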