Computational approaches for structural analysis of large bio-molecular complexes
Large bio-molecular complexes (LBCs) are assemblies of dozens to thousands of structural/functional biomolecular subunits. Two representatives of LBCs are ribosomes and viruses. While ribosomes play an essential role in protein synthesis in cells, viruses are of extreme interest because they account for many serious illnesses in animals and plants. Even with today’s advanced vaccines and drugs, millions of people die from viral diseases worldwide each year. Studying the three-dimensional (3D) structures of LBCs and predicting their functions are of great importance in the life sciences. X-ray crystallography and nuclear magnetic resonance (NMR) are two major techniques for determining 3D structures of bio-molecules. These approaches, however, are quite often limited to individual proteins or relatively small assemblies. Cryo-electron microscopy (cryo-EM), coupled with single particle reconstruction, has become a powerful technique to reveal the “full picture” of a large bio-molecular complex, offering a more complete structural and functional description of the protein machinery. While the number of 3D cryo-EM maps at intermediate resolutions (5–20 Å) has increased steadily in recent years, few efforts have been made towards automatic and quantitative interpretation of the reconstructed maps. Current methods for interpreting reconstructed maps depend mainly upon visual inspection with the help of various graphics tools. Due to the large physical size and complexity of the bio-molecular assemblies, visually interpreting the detailed features/activities of an interacting bio-molecular system is not only tedious and subjective but also very difficult. For this reason, automatic structural analysis of large bio-molecular complexes has become increasingly important.
This dissertation presents various computational approaches for automatic structural analysis of LBCs, including symmetry detection, subunit segmentation, subunit alignment, secondary structure identification, and atomic structure modelling/fitting. The final output of the structural analysis of an LBC is a pseudo-atomic structure model of the given LBC map. To achieve this goal, we also present a number of new algorithms for image enhancement, skeleton extraction, and image segmentation. We evaluate these algorithms on various types of images at medical, cellular, and molecular levels. In addition to 3D image processing, we also present two algorithms for automatic particle picking in 2D electron micrographs.