SDhaP: Haplotype Assembly for Diploids and Polyploids Via Semi-Definite Programming
dc.contributor.utaustinauthor | Das, Shreepriya | en_US |
dc.contributor.utaustinauthor | Vikalo, Haris | en_US |
dc.creator | Das, Shreepriya | en_US |
dc.creator | Vikalo, Haris | en_US |
dc.date.accessioned | 2016-10-28T19:53:30Z | |
dc.date.available | 2016-10-28T19:53:30Z | |
dc.date.issued | 2015-04 | en_US |
dc.description.abstract | Background: The goal of haplotype assembly is to infer the haplotypes of an individual from a mixture of sequenced chromosome fragments. The limited lengths of paired-end sequencing reads and inserts render haplotype assembly computationally challenging; in fact, most formulations of the problem are known to be NP-hard. The dimensions (and, therefore, the difficulty) of haplotype assembly problems keep increasing as sequencing technology advances and the lengths of reads and inserts grow. The computational challenges are even more pronounced for polyploid haplotypes, whose assembly is considerably more difficult than that of diploids. Fast, accurate, and scalable methods for haplotype assembly of diploid and polyploid organisms are needed. Results: We develop a novel framework for diploid/polyploid haplotype assembly from high-throughput sequencing data. The method formulates the haplotype assembly problem as a semi-definite program and exploits its special structure - namely, the low rank of the underlying solution - to solve it rapidly and with high accuracy. The developed framework is applicable to both diploid and polyploid species. The code for SDhaP is freely available at https://sourceforge.net/projects/sdhap. Conclusion: Extensive benchmarking tests on both real and simulated data show that the proposed algorithms outperform several well-known haplotype assembly methods in terms of accuracy, speed, or both. Useful recommendations for the coverage needed to achieve near-optimal solutions are also provided. | en_US |
dc.description.department | Electrical and Computer Engineering | en_US |
dc.description.sponsorship | National Science Foundation CCF-1320273 | en_US |
dc.identifier | doi:10.15781/T2XK84T0R | |
dc.identifier.citation | Das, Shreepriya, and Haris Vikalo. "SDhaP: haplotype assembly for diploids and polyploids via semi-definite programming." BMC Genomics, Vol. 16, No. 1 (Apr., 2015): 1. | en_US |
dc.identifier.doi | 10.1186/s12864-015-1408-5 | en_US |
dc.identifier.issn | 1471-2164 | en_US |
dc.identifier.uri | http://hdl.handle.net/2152/43347 | |
dc.language.iso | English | en_US |
dc.relation.ispartof | | en_US |
dc.relation.ispartofserial | BMC Genomics | en_US |
dc.rights | Administrative deposit of works to Texas ScholarWorks: This work's author(s) is or was a University faculty member, student or staff member; this article is already available through open access or the publisher allows a PDF version of the article to be freely posted online. The library makes the deposit as a matter of fair use (for scholarly, educational, and research purposes), and to preserve the work and further secure public access to the works of the University. | en_US |
dc.rights.restriction | Open | en_US |
dc.subject | haplotype assembly | en_US |
dc.subject | semi-definite programming | en_US |
dc.subject | diploid | en_US |
dc.subject | polyploid | en_US |
dc.subject | genome sequence data | en_US |
dc.subject | Grothendieck's inequality | en_US |
dc.subject | reconstruction | en_US |
dc.subject | algorithms | en_US |
dc.subject | cut | en_US |
dc.subject | biotechnology & applied microbiology | en_US |
dc.subject | genetics & heredity | en_US |
dc.title | SDhaP: Haplotype Assembly for Diploids and Polyploids Via Semi-Definite Programming | en_US |
dc.type | Article | en_US |