Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion

dc.contributor.advisor | Hillis, David M. | en
dc.creator | Zwickl, Derrick Joel | en
dc.date.accessioned | 2008-08-28T23:01:34Z | en
dc.date.available | 2008-08-28T23:01:34Z | en
dc.date.issued | 2006 | en
dc.description | text | en
dc.description.abstract | Phylogenetic trees have a multitude of applications in biology, epidemiology, conservation and even forensics. However, the inference of phylogenetic trees can be extremely computationally intensive. The computational burden of such analyses becomes even greater when model-based methods are used. Model-based methods have been repeatedly shown to be the most accurate choice for the reconstruction of phylogenetic trees, and thus are an attractive choice despite their high computational demands. Using the Maximum Likelihood (ML) criterion to choose among phylogenetic trees is one commonly used model-based technique. Until recently, ML analysis of biological sequence data with existing software was largely intractable for more than about one hundred sequences. Because advances in sequencing technology now make the assembly of datasets consisting of thousands of sequences common, ML search algorithms that are able to quickly and accurately analyze such data must be developed if ML techniques are to remain a viable option in the future. I have developed a fast and accurate algorithm that allows ML phylogenetic searches to be performed on datasets consisting of thousands of sequences. My software uses a genetic algorithm approach, and is named GARLI (Genetic Algorithm for Rapid Likelihood Inference). The speed of this new algorithm results primarily from its novel technique for partial optimization of branch-length parameters following topological rearrangements. Experiments performed with GARLI show that it is able to analyze large datasets in a small fraction of the time required by the previous generation of search algorithms. The program also performs well relative to two other recently introduced fast ML search programs. Large parallel computer clusters have become common at academic institutions in recent years, presenting a new resource to be used for phylogenetic analyses. The P-GARLI algorithm extends the approach of GARLI to allow simultaneous use of many computer processors. The processors may be instructed to work together on a phylogenetic search in either a highly coordinated or largely independent fashion. Preliminary experiments suggest that analyses using the P-GARLI software can result in better solutions than can be obtained with the serial GARLI algorithm.
dc.description.department | Biological Sciences, School of | en
dc.format.medium | electronic | en
dc.identifier | b64905330 | en
dc.identifier.oclc | 85839175 | en
dc.identifier.uri | http://hdl.handle.net/2152/2666 | en
dc.language.iso | eng | en
dc.rights | Copyright is held by the author. Presentation of this material on the Libraries' web site by University Libraries, The University of Texas at Austin was made possible under a limited license grant from the author who has retained all copyrights in the works. | en
dc.subject.lcsh | Phylogeny--Data processing | en
dc.title | Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion | en
dc.type.genre | Thesis | en
thesis.degree.department | Biological Sciences, School of | en
thesis.degree.discipline | Ecology, Evolution, and Behavior | en
thesis.degree.grantor | The University of Texas at Austin | en
thesis.degree.level | Doctoral | en
thesis.degree.name | Doctor of Philosophy | en
Files

Original bundle
Name: zwickld81846.pdf
Size: 3.37 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.65 KB
Format: Plain Text