Promoter element engineering towards modulated and consistent gene expression
Metabolic engineering requires precise control over expression strength, which can be accomplished effectively by engineering the promoter element. The field therefore needs synthetic promoters that function predictably and consistently. Success in these engineering efforts requires an understanding of the currently elusive sequence-to-function relationship of promoter elements. This work advances our understanding of promoter architecture and develops several methods by which this understanding can be leveraged to create new synthetic promoters. Specifically, this work produces modular synthetic promoter components that can be combined to function predictably and consistently in a variety of circumstances. First, we show that promoters in E. coli can be divided into two separable components, the core promoter and the upstream element (UP element). Through mutagenesis of the UP element region alone, we create a suite of new promoters capable of up to 9-fold activation of expression from a core promoter. Further, we showcase the modularity of these UP elements by placing them upstream of different core promoters and observing conserved levels of expression activation. Second, we show that promoters in Saccharomyces cerevisiae can be engineered to perform consistently across the exponential and stationary growth phases through a computational motif discovery approach. Here, we show that promoter fragments containing the motifs discovered for each growth phase can be combined to achieve a promoter with consistent and strong expression across both phases. We observe a 38-fold increase in exponential phase signal when exponential motifs are inserted into a native promoter with high stationary phase expression. Further, we show that this consistency is retained across multiple scales of growth by characterizing expression in microtiter plates, tubes, and flasks.
Finally, we take an entirely computational approach to address challenges in the classification of promoters in S. cerevisiae that stem from the length and complexity of these sequences. Deep learning algorithms capable of decoding higher-order sequence patterns are typically employed only on short (<100 base pair) sequences or on datasets containing tens of thousands of examples. In this work, we show that including promoters from phylogenetically related species improves the performance of convolutional neural networks trained to recognize promoter function in 800 base pair input sequences, compared to models trained on the 6,000 available S. cerevisiae promoter sequences alone.
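
As an illustration of what an 800 base pair input to a convolutional network might look like, the sketch below one-hot encodes a DNA sequence into a fixed-length matrix. The abstract does not state how sequences were encoded, so the function name `one_hot_encode`, the A/C/G/T column order, the left-padding with zero rows, and the choice to keep the 3' end of over-long sequences are all assumptions made for this example, not the method used in this work.

```python
# Hypothetical helper: the encoding scheme is an assumption, not taken from the source.
BASES = "ACGT"  # assumed column order of the one-hot matrix


def one_hot_encode(seq, length=800):
    """Return a `length` x 4 matrix (list of rows) for a DNA sequence.

    Sequences longer than `length` keep their last `length` bases (assumed
    to be the end nearest the gene); shorter ones are left-padded with
    all-zero rows. Bases outside A/C/G/T (for example N) become zero rows.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    seq = seq.upper()[-length:]
    # Left-pad with zero rows so every encoded sequence has the same shape.
    rows = [[0.0] * 4 for _ in range(length - len(seq))]
    for base in seq:
        row = [0.0] * 4
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


# Small demonstration with a toy length instead of 800.
encoded = one_hot_encode("ACGTN", length=8)
```

Under these assumptions, promoters from S. cerevisiae and from related species would all map to matrices of the same shape, which is what allows them to be pooled into a single training set.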