dc.creator: Singh-Blom, U. Martin [en]
dc.creator: Natarajan, Nagarajan [en]
dc.creator: Tewari, Ambuj [en]
dc.creator: Woods, John O. [en]
dc.creator: Dhillon, Inderjit S. [en]
dc.creator: Marcotte, Edward M. [en]
dc.identifier.citation: Singh-Blom UM, Natarajan N, Tewari A, Woods JO, Dhillon IS, et al. (2013) Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses. PLoS ONE 8(5): e58977. doi:10.1371/journal.pone.0058977 [en]
dc.description: U. Martin Singh-Blom is with UT Austin and Karolinska Institutet, Nagarajan Natarajan is with UT Austin, Ambuj Tewari is with University of Michigan, John O. Woods is with UT Austin, Inderjit S. Dhillon is with UT Austin, and Edward M. Marcotte is with UT Austin. [en]
dc.description.abstract: Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated from its success in social network link prediction, and is very closely related to some of the recent methods proposed for gene-disease association inference. The second method, called CATAPULT (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine where the features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies, on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Finally, by measuring the performance of the methods using two different evaluation strategies, we show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas CATAPULT is better suited to correctly identifying gene-trait associations overall. The authors want to thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin. [en]
dc.description.sponsorship: This work was supported by grants from the U.S. Army Research (58343-MA) to EMM and ISD, from the Cancer Prevention & Research Institute of Texas (CPRIT), U.S. National Science Foundation, United States National Institutes of Health, Welch Foundation (F1515), and the Packard Foundation to EMM, and from DOD Army (W911NF-10-1-0529), U.S. National Science Foundation (CCF-0916309) and the Moncrief Grand Challenge Award to ISD. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. [en]
dc.publisher: Public Library of Science [en]
dc.rights: Attribution 3.0 United States [en]
dc.subject: Drug information [en]
dc.subject: Drug interactions [en]
dc.subject: Gene regulatory networks [en]
dc.subject: Genetic networks [en]
dc.subject: Human learning [en]
dc.subject: Protein interaction networks [en]
dc.subject: Support vector machines [en]
dc.title: Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses [en]
dc.description.department: Cellular and Molecular Biology [en]

Except where otherwise noted, this item's license is described as Attribution 3.0 United States