Scaling up DNA computation with next-generation sequencing and modified nucleic acids




Wang, Siyuan Stella




A central goal of biomolecular engineering is the construction of tools to manipulate nanoscale processes. DNA has proved to be a programmable material suited for this task. DNA strand displacement reactions can be designed to process chemical information in the form of concentrations and sequences. DNA nanotechnology has thus far produced devices for the detection of disease biomarkers, performed computation on chemical inputs, powered mechanical action at both the nanoscale and the macroscale, and assembled precise sub-micron structures from the bottom up.

This dissertation addresses three main topics. First, we develop predictive models for non-canonical nucleic acid hybridization that enable rational design. Second, we show how rationally designed DNA strand displacement reactions can be used to perform computations on information stored in DNA. Third, we present nucleic acid computation that combines strand displacement with transcription and discuss strategies for facilitating the scale-up of networks. Finally, an appendix discusses data storage in nucleic acid variants.

Rational design of DNA circuits and structures is possible because the thermodynamics of DNA and RNA hybridization can be approximated using a nearest-neighbor model. The parameters of this model are typically determined experimentally from the hyperchromism of denatured nucleic acids, measured in low-throughput UV-Vis spectrophotometry melting experiments that require a sizable quantity of duplex for each of a large set of sequences. For non-canonical nucleic acids or non-standard interactions, this characterization can be prohibitively costly and time-consuming. Initially, we considered repurposing a next-generation sequencing (NGS) platform for high-throughput mapping of nucleic acid hybridization across a large sequence space; however, we found that the platform's dynamic range makes it suitable for mapping protein-nucleic acid interactions but not nucleic acid-nucleic acid hybridization. We then assessed whether high-resolution melting (HRM) can be used as a rapid method for determining approximate model parameters and found that HRM-derived models can predict the relative stabilities of duplexes with different sequences. Using this method, we developed a predictive model for phosphorothioate DNA, which we then applied to the design of a phosphorothioate-modified catalytic hairpin assembly circuit.
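As a rough illustration of how a nearest-neighbor model ranks duplex stabilities, the sketch below sums a free-energy contribution for each dinucleotide step plus an initiation term for each terminal base pair. The numerical values are illustrative assumptions patterned on commonly cited unified parameters for unmodified DNA at 37 °C; they are not the phosphorothioate parameters developed in this work and should be checked against the primary literature before any real use. The function name `duplex_dg` is hypothetical.

```python
# Illustrative nearest-neighbor free energies in kcal/mol at 37 C (assumed values,
# not the parameters derived in this dissertation). Keys are 5'->3' dinucleotide
# steps on one strand; a step and its reverse complement share one value.
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}

# Assumed initiation penalty for each terminal base pair, and a symmetry
# correction applied only to self-complementary sequences.
INIT_DG = {"A": 1.03, "T": 1.03, "G": 0.98, "C": 0.98}
SYMMETRY_DG = 0.43

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_dg(seq: str) -> float:
    """Estimate the free energy of seq hybridized to its perfect complement."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two bases to form a nearest-neighbor step")
    # Sum the stacking contribution of every overlapping dinucleotide step.
    total = sum(NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1))
    # Add one initiation term per duplex end.
    total += INIT_DG[seq[0]] + INIT_DG[seq[-1]]
    # Self-complementary duplexes receive the symmetry correction.
    if seq == reverse_complement(seq):
        total += SYMMETRY_DG
    return total


if __name__ == "__main__":
    for s in ("GCGCGC", "ATATAT", "CAGT"):
        print(s, round(duplex_dg(s), 2))
```

A more negative value indicates a more stable duplex, so a model of this form can rank sequences by relative stability even when its absolute values are only approximate.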

DNA strand displacement reactions can be used not only to manipulate chemical information in the form of concentration, but also to read and write more permanent forms of information, such as sequence and secondary structure. We developed and demonstrated a DNA data storage scheme that enables in-memory computation. DNA is a promising data storage medium for meeting today's rapidly growing data storage needs; however, because computation on the stored data is usually performed in silico, strands must be sequenced and re-synthesized at every read-write cycle. Our scheme circumvents the bottleneck of de novo oligonucleotide synthesis by updating information using strand displacement cascades that result in sequence changes readable by NGS. We experimentally demonstrated two algorithms, binary counting and cellular automaton Rule 110, and additionally showed that naturally occurring DNA sequences can be repurposed for storage and computation without sequence design. Our scheme supports parallel computation on multiple data items, as well as random access and sequential computation, allowing for scaled-up storage.
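For readers unfamiliar with the two algorithms named above, the following is a minimal in silico sketch of their update rules. It shows only the logic of each step; it does not model the strand displacement chemistry, and the fixed zero boundary cells and the most-significant-bit-first register layout are assumptions made for illustration.

```python
def rule110_step(cells):
    """Apply one Rule 110 update to a list of 0/1 cells.

    Cells beyond either edge are assumed to be 0 (an illustrative boundary choice).
    """
    padded = [0] + list(cells) + [0]
    new_cells = []
    for i in range(1, len(padded) - 1):
        # Encode the (left, center, right) neighborhood as a number from 0 to 7.
        index = (padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1]
        # Bit `index` of the number 110 (binary 01101110) gives the new state.
        new_cells.append((110 >> index) & 1)
    return new_cells


def increment(bits):
    """Add one to a binary register stored most-significant-bit first.

    Overflow wraps around to all zeros.
    """
    out = list(bits)
    for i in reversed(range(len(out))):
        if out[i] == 0:
            out[i] = 1      # first zero from the right becomes one
            return out
        out[i] = 0          # trailing ones are cleared as the carry propagates
    return out


if __name__ == "__main__":
    state = [0, 0, 0, 1]
    for _ in range(4):
        print(state)
        state = rule110_step(state)
```

Both rules update each position using only local information (a cell and its neighbors, or a bit and the incoming carry), which is the kind of update a local sequence-rewriting reaction can in principle carry out.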

Programmable chemical computation is also possible with enzymatic reactions such as transcription. The catalytic activity of enzymes has the potential to simplify circuit design and produce biologically potent signals. Practical obstacles to expanding chemical computation circuits such as transcription networks include limited signal readout and time-consuming purification. We addressed these obstacles by building on previous efforts to construct scalable in vitro transcription networks. We updated a single-stranded inhibitory transcription switch design for compatibility with multiplexed NGS readout and developed an analogous single-stranded switch that is activated by nucleic acid signals.

