Network-based strategies for discovering functional associations of uncharacterized genes and gene sets




Wang, Peggy I.




High-throughput technology is changing the face of research biology, generating an ever-growing number of large-scale data sets. With experiments utilizing next-generation gene sequencing, mass spectrometry, and various other global surveys of proteins, translating the plethora of data into biology has become a daunting task. In response, functional networks have been developed as a means of integrating the data into models of proteomic organization. In these networks, proteins are linked if evidence indicates that they operate together in the same function, facilitating predictions about the functions, phenotypes, and disease associations of uncharacterized genes. In this body of work, we explore different applications of this so-called "guilt-by-association" concept to predict loss-of-function phenotypes and diseases associated with genes in yeast, worm, and human. We also scrutinize certain limitations associated with the functional networks, predictive methods, and measures of performance used in our studies. Importantly, the predictive method and performance measure, if not chosen appropriately for the biological objective at hand, can substantially distort the results and interpretation of a study. These findings are incorporated into the development of RIDDLE, a method for characterizing whole sets of genes. This machine learning-based method provides a measure of network distance, and thus functional association, between two sets of genes. RIDDLE may be applied to a wide range of problems, as we demonstrate with several biological examples, including linking microRNA-450a to ocular development and disease. In the last decade, functional networks have proven to be a useful strategy for interpreting large-scale proteomic and genomic data sets. With the continued growth of genome coverage in networks and the innovation of predictive methods, we will surely advance towards our ultimate goal of understanding the genetic changes that underlie disease.



LCSH Subject Headings