MARLEDA: effective distribution estimation through Markov random fields
Many problems in the biological sciences, such as DNA sequencing, protein structure prediction, and molecular docking, are being approached computationally. These problems require sophisticated solution methods that capture the complexity of biological domains. Traditionally, such methods have been problem-specific, but recent advances in generic problem solvers offer hope for a new breed of computational tools. The challenge is to develop methods that can automatically acquire an understanding of a complex problem domain. Estimation of Distribution Algorithms (EDAs) are generic search methods that use statistical models to learn the structure of a problem domain. EDAs have been successfully applied to many difficult search problems, such as circuit design, Ising spin glass optimization, and various scheduling tasks. However, current EDAs contain ad hoc limitations that reduce their capacity to solve hard problems. This dissertation presents a new EDA, the Markovian Learning Estimation of Distribution Algorithm (MARLEDA), that employs a Markov random field model. The model is learned in a novel way that overcomes these ad hoc limitations. MARLEDA is shown to perform well on standard benchmark search tasks. A multiobjective extension of MARLEDA is developed for predicting the secondary structure of RNA molecules. The extension is shown to produce high-quality predictions in comparison with several contemporary methods, laying the groundwork for a new computational tool for RNA researchers.