TexasScholarWorks

    Supervised Community Detection in Protein-interaction Networks

    View/Open
    Palukuri_Poster.pdf (1.488Mb)
    Date
    2019
    Author
    Palukuri, Meghana
    Marcotte, Edward
    Abstract
    Community detection problems arise in many fields that work with networks, from biology and medicine to social studies and cybersecurity. Networks in these fields tend to be massive: for instance, hu.MAP, the human protein interaction network assembled by our lab, has 17 million edges. Here, community detection translates to finding protein complexes, which will help advance our understanding of several cellular functions and disease mechanisms. To solve this computationally challenging big data problem, we developed Super.Complex (short for Supervised Complex), a computational pipeline employing AutoML and subgraph sampling techniques. While most state-of-the-art algorithms employ unsupervised graph clustering methods, a supervised approach holds more promise for finding accurate communities that mimic the real world. As data on known communities becomes increasingly available in many applications, supervised methods become more relevant. Super.Complex implements a streamlined algorithm that samples subgraphs from the weighted network and classifies them as communities or non-communities via a supervised ML model. The steps involved are (i) sampling non-community data as random walks on the graph, (ii) feature extraction and selection for known communities and generated non-communities, (iii) an AutoML pipeline for identifying and training the best supervised machine learning model for binary classification of subgraphs, and (iv) intelligent sampling of candidate subgraphs for classification via three search techniques: greedy, iterative simulated annealing, and Metropolis. The last step is in fact a solution to the NP-hard problem of identifying maximally scoring subgraphs in a network. The algorithm is applied to real data from different human and yeast protein interaction networks, yielding F1 scores ranging from 0.96 to 0.99 and identifying previously unknown biological complexes. Further, Super.Complex outperforms many state-of-the-art algorithms in both accuracy and performance, and scales to huge networks through its distributed framework.
    Department
    Texas Advanced Computing Center (TACC)
    Subject
    networks
    community detection
    URI
    https://hdl.handle.net/2152/79826
    http://dx.doi.org/10.26153/tsw/6852
    Collections
    • TACCSTER 2019 Proceedings

    University of Texas at Austin Libraries

    © The University of Texas at Austin

     

     
