Large-scale phylogenetic analysis
The phylogeny problem is to reconstruct a phylogenetic tree whose leaves are labeled by the taxa of interest and whose internal nodes represent ancestral taxa. Recent advances in molecular biology and genomics have provided biologists with molecular data at an unprecedented rate and scale, in particular whole-genome data for a growing number of species. Analyzing data at this scale is difficult for three reasons. First, the number of possible phylogenetic trees grows superexponentially with the number of species being studied. Second, detailed sequence data for each species usually carry conflicting phylogenetic signals. Third, more species usually means more evolutionary events along the evolutionary tree, which usually leads to highly saturated data that are difficult to analyze in general. In this thesis I present two approaches to address these difficulties. The first is to use genome rearrangement evolution, an evolutionary process with a lower evolutionary rate than DNA sequence evolution. The second is to process the multiple trees returned by tree reconstruction algorithms by applying clustering methods.
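
To give a sense of the superexponential growth mentioned above, the following is a minimal sketch that counts unrooted, fully resolved (binary) tree topologies on n labeled taxa using the standard double-factorial formula (2n - 5)!!. The restriction to unrooted binary trees and the function name num_unrooted_binary_trees are assumptions made for illustration; they are not taken from the thesis.

```python
def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the standard formula (2n - 5)!! = 3 * 5 * ... * (2n - 5), valid for n >= 3.
    Illustrative helper only; the name is hypothetical.
    """
    if n_taxa < 3:
        raise ValueError("formula is defined for at least 3 taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5 (empty product when n = 3).
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


if __name__ == "__main__":
    # Print the count for a few taxon-set sizes to show how quickly it grows.
    for n in (4, 5, 10, 20, 50):
        print(n, num_unrooted_binary_trees(n))
```

Each added taxon multiplies the count by a factor that itself grows with n, so exhaustive enumeration of tree space quickly becomes infeasible as the number of taxa increases.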