Curating a Database of Genetic Biosensors and Implications for Machine Learning Guided Evolution
Abstract
Biosensors are a vital part of molecular biology, allowing cells to monitor stimuli from their external environment. They are also useful tools in biotechnology for controlling gene expression with small-molecule inputs. A database of naturally occurring biosensors was created by reviewing 1,436 PubMed articles across 11 transcription regulator families. For each protein, the article link, source species, UniProt ID, PDB entry, ligands, and operator sites were recorded. Entries were then classified by the kind of characterization reported: structural, ligand, operator site, or application. Of the database entries, 693 came from four major families: TetR, GntR, LacI, and LuxR. A primary motive for creating this database is its use with convolutional neural networks to predict novel and optimized biosensor structures. A semi-supervised approach to training the network is also considered, because data on transcription factors are not recorded consistently across the scientific community. Combined with convolutional neural networks, this database could help open a new era of directed protein evolution guided by artificial intelligence.
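As an illustration of the record structure described above, the following is a minimal Python sketch of one database entry. The class name, field names, and the `has_application` flag are hypothetical, and the rule that derives characterization labels from which fields are filled in is an assumption; the abstract does not state how entries were actually classified, or whether an entry can carry more than one label.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BiosensorEntry:
    """One curated record; all field names here are illustrative assumptions."""
    family: str                       # e.g. "TetR", "GntR", "LacI", "LuxR"
    article_url: str                  # link to the source PubMed article
    species: str                      # source organism
    uniprot_id: Optional[str] = None  # missing when no accession was reported
    pdb_id: Optional[str] = None      # missing when no structure was deposited
    ligands: List[str] = field(default_factory=list)
    operator_sites: List[str] = field(default_factory=list)
    has_application: bool = False     # hypothetical flag for a reported application

    def characterizations(self) -> List[str]:
        """Return the characterization labels supported by the recorded fields."""
        labels = []
        if self.pdb_id:
            labels.append("structural")
        if self.ligands:
            labels.append("ligand")
        if self.operator_sites:
            labels.append("operator site")
        if self.has_application:
            labels.append("application")
        return labels


# Placeholder values only; this is not a real database entry.
example = BiosensorEntry(
    family="TetR",
    article_url="https://pubmed.ncbi.nlm.nih.gov/",
    species="placeholder species",
    pdb_id="PLACEHOLDER",
    ligands=["placeholder ligand"],
)
print(example.characterizations())
```

Keeping unreported fields as `None` or empty lists, rather than guessing values, preserves the gaps in the literature explicitly. This matters for the semi-supervised setting mentioned above, where partially annotated entries can still serve as unlabeled examples.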