Browsing by Subject "Computational biology"
Now showing 1 - 9 of 9
Item Algorithms for next generation sequencing data analysis (2015-12)
Das, Shreepriya; Vikalo, Haris; Dhillon, Inderjit S; Ravikumar, Pradeep; Sanghavi, Sujay; Tewfik, Ahmed
The field of genomics has witnessed tremendous achievements in the past two decades. Advances in sequencing technology have enabled the acquisition of massive amounts of data that reveal information about an individual's genetic blueprint and are revolutionizing the field of molecular biology. Interpretation of such data requires solving mathematical (statistical and computational) problems rendered difficult by the complex interacting processes that are characteristic of biological systems; the data are high dimensional, typically noisy and often incomplete. Algorithm design in these settings requires a deep understanding of the underlying biological principles, good mathematical abstractions permitting tractable inference, and fast, scalable and accurate solutions using ideas from diverse fields such as optimization, probability, statistics and algorithms. This dissertation deals with two such problems occurring in the field of bioinformatics/computational biology. First, for the problem of basecalling for sequencing-by-synthesis (Illumina) platforms, I describe novel computationally tractable statistical models and signal processing schemes that are fast and have lower error rates than existing state-of-the-art basecallers. Extensions to a soft information exchange setup to do joint basecalling and SNP calling are also explored. Next, I describe two novel single individual haplotyping inference schemes using an (optimal) branch and bound framework and (scalable) low rank semidefinite programming ideas for diploid and polyploid species.
In addition to improving the quality of basecalling, SNP calling, genotyping and haplotyping, I also developed user-friendly software that can be used by the biological research community for various purposes including cancer genomics and metagenomics studies.

Item Development of a computational-experimental model to predict glioma response to radiation treatment (2021-11-29)
Liu, Junyan, Ph. D.; Yankeelov, Thomas E.; Yeh, Hsin-chih; Brock, Amy; Vasquez, Karen M.
Radiation is essential to malignant glioma and glioblastoma treatment. However, the prognosis of glioblastoma remains poor, with a median survival of 15 months. This is partly due to the heterogeneous radiosensitivity among patients. A mechanism-based model that can make dynamic predictions has the potential to guide and optimize the treatment on a patient-specific basis. The purpose of this dissertation is to develop and validate a computational-experimental model that explicitly incorporates the underlying radiobiology and makes accurate predictions of the radiation response of glioma cells. Specifically, we first propose a mathematical model of the response to a single dose of radiation that incorporates DNA repair and cell death pathways, and validate it under eight different doses from 2 Gy to 16 Gy via microscopy in vitro. We then extend this model to fractionated treatment and validate it with six different fractionation schemes using total doses of either 16 Gy or 20 Gy. Finally, we propose a data assimilation framework that individualizes the prediction based on the observations of individual replicates, which further improves the prediction accuracy.
We present a full story of how developing a mechanism-based, experiment-driven mathematical model can assist us in characterizing and predicting radiation response, which could eventually be used to optimize the treatment schedule.

Item Integrated systems approach for mechanistic understanding of human cancers (2023-12)
Burgman, Brandon; Yi, Song (Stephen); Tiziani, Stefano; Maynard, Jennifer; Brock, Amy
Research described in this dissertation has focused on cancer systems biology, aiming to tackle fundamental questions underlying genotype-phenotype relationships. I combine my unique computational and experimental expertise to build a quantitative understanding of genetic/epigenetic variation and signaling network perturbation in cancer. Understanding the functional impact of cancer somatic mutations represents a critical knowledge gap for implementing precision oncology. To resolve this challenge, we developed e-MutPath, a network-based computational method to identify candidate ‘edgetic’ mutations that perturb functional pathways. In a specific context, alterations in immune-related pathways are common hallmarks of cancer. Herein we developed a Network-based Integrative model to Prioritize Potential immune respondER genes (NIPPER). Using an interactome network propagation framework integrated with drug associated gene signatures, we identified potential immunomodulatory drug candidates. By developing an integrated multi-omics model, I further discovered widespread epigenetic perturbations in colorectal cancer, with a clear dependency on tumor sidedness. This result provides a possible reason why right-sided colorectal cancer leads to overall worse prognosis. More importantly, to reveal molecular interaction networks, I invented a high-throughput wet-lab technology to investigate contextual dependencies in the human RNA-protein interactome with Exosome-mediated RNA-Protein Trafficking (ExRPT).
Combined with single-molecule barcoding and detection, our platform aims to provide a flexible interface that integrates complex libraries of recombinant RNAs, coupled with fluorescence-based sorting and barcode sequencing. Additionally, lipid vesicles are thought to have a protective role in shielding cargo RNA from enzymatic degradation and provide a closed environment to maintain RNA stability. Through this we seek to associate conditions that alter epigenetic signaling networks that respond to the extracellular environment and drive diverse disease states. Last but not least, we developed ‘i-Modern’, a robust deep learning framework for integrating multi-omics data to efficiently and precisely stratify cancer patients and predict survival prognosis. Together, the landmark multi-omics signatures identified here may serve as potential therapeutic targets in cancer.

Item Modeling the interaction and energetics of biological molecules with a polarizable force field (2013-05)
Shi, Yue, active 21st century; Ren, Pengyu
Accurate prediction of protein-ligand binding affinity is essential to computational drug discovery. Current approaches are limited by the accuracy of the underlying potential energy model that describes atomic interactions. A more rigorous physical model is critical for evaluating molecular interactions to chemical accuracy. The objective of this thesis research is to develop a polarizable force field with an accurate representation of electrostatic interactions, to apply this model to protein-ligand recognition, and ultimately to solve practical problems in computer aided drug discovery. By calculating the hydration free energies of a series of organic small molecules, an optimal protocol is established to develop the electrostatic parameters from quantum mechanics calculations. Next, the systematic development and parameterization procedure of the AMOEBA protein force field is presented.
The derived force field has gone through extensive validations in both gas phase and condensed phase. The last part of the thesis involves the application of AMOEBA to study protein-ligand interactions. The binding free energies of benzamidine analogs to trypsin using molecular dynamics alchemical perturbation are calculated with encouraging accuracy. AMOEBA is also used to study the thermodynamic effect of constraining and hydrophobicity on binding energetics between phosphotyrosine(pY)-containing tripeptides and the SH2 domain of growth receptor binding protein 2 (Grb2). The underlying mechanism of an "entropic paradox" associated with ligand preorganization is explored.

Item Molecular investigation of polypyrrole and surface recognition by affinity peptides (2011-12)
Fonner, John Michael; Ren, Pengyu; Schmidt, Christine E.; Elber, Ron; Roy, Krishnendu; Georgiou, George
Successful tissue engineering strategies in the nervous system must be carefully crafted to interact favorably with the complex biochemical signals of the native environment. To date, all chronic implants incorporating electrical conductivity degrade in performance over time as the foreign body reaction and subsequent fibrous encapsulation isolate them from the host tissue. Our goal is to develop a peptide-based interfacial biomaterial that will non-covalently coat the surface of the conducting polymer polypyrrole, allowing the implant to interact with the nervous system through both electrical and chemical cues. Starting with a candidate peptide sequence discovered through phage display, we used computational simulations of the peptide on polypyrrole to describe the bound peptide structure, explore the mechanism of binding, and suggest new, better binding peptide sequences.
After experimentally characterizing the polymer, we created a molecular mechanics model of polypyrrole using quantum mechanics calculations and compared its in silico properties to experimental observables such as density and chain packing. Using replica exchange molecular dynamics, we then modeled the behavior of affinity binding peptides on the surface of polypyrrole in explicit water and saline environments. Relative measurements of the contributions of each amino acid were made using distance measurements and computational alanine scanning.

Item Numerical methods for simulations and optimization of vesicle flows in microfluidic devices (2019-05)
Kabacaoğlu, Gökberk; Biros, George; Ghattas, Omar; Moser, Robert; Shelley, Michael
Vesicles are highly deformable particles that are filled with a Newtonian fluid. They resemble biological cells without a nucleus such as red blood cells (RBCs). Vesicle flow simulations can be used to design microfluidic devices for medical diagnoses and drug delivery systems. This dissertation focuses on efficient numerical methods for simulations and optimization of vesicle flows in two dimensions. We consider flows with very low Reynolds numbers and inextensible vesicle membranes that resist bending. Our numerical scheme is based on a boundary integral formulation which is known to be efficient for such flows. This formulation leads to a set of nonlinear integro-differential equations for the vesicle dynamics. Complex interplay between the nonlocal hydrodynamic forces and the membranes’ elasticity determines the vesicles’ motion. Many state-of-the-art numerical schemes can resolve these complex flows. However, simulations remain computationally expensive since high-resolution discretization is needed. The high computational cost limits the use of the simulations for practical purposes such as optimization. Our first attempt to reduce the cost is to use low-resolution discretization.
We present a scheme that systematically integrates several correction algorithms that are necessary for stable and accurate low-resolution simulations. We compare the low-resolution simulations with their high-fidelity counterparts. We observe that our scheme enables both fast and statistically accurate simulations. We accelerate vesicle flow simulations further by replacing expensive parts of the numerical scheme with low-cost function approximations. We propose a machine-learning-augmented reduced model that uses several multilayer perceptrons to model different aspects of the flows. Although we train the perceptrons with high-fidelity single-particle simulations for one time step, our method enables us to conduct long-horizon simulations of suspensions with several particles in confined geometries. It is faster than a state-of-the-art numerical scheme having the same number of degrees of freedom and can reproduce several features of the flow accurately. It generalizes as is to other particles like deformable capsules, drops, filaments and rigid bodies. Moreover, we investigate deformability-based sorting of RBCs using a microfluidic device that enables medical diagnoses of diseases such as malaria. Using our numerical scheme, we solve a design optimization problem to find optimal designs of the device that provide efficient sorting of cells with arbitrary mechanical properties.

Item One cell as a mixture : simulation of the mechanical responses of valve interstitial cells (2016-08)
Sakamoto, Yusuke; Sacks, Michael S.; Prudhomme, Serge; Gonzalez, Oscar; Ghattas, Omar; Rodin, Gregory J; Guilak, Farshid
The functions of heart valve interstitial cells (VICs) are intimately connected to heart valve tissue remodeling and repair as well as the initiation of pathological processes. It is known that excessive and persisting environmental changes cause improper regulation of VICs, and a clinically significant valve pathology may result.
Much of VIC function is modulated through changes in stress fiber activation, resulting in part from changes in external loading by the surrounding extracellular matrix (ECM) and cytokines. Thus, current research challenges aim at characterizing the mechanisms that activate VIC contractility, and at modeling the mechanical interactions of contractile VICs with the surrounding valve matrix. In particular, many questions remain as to how stress fibers develop active contractile forces under varying normal and pathological conditions. The main objective of this dissertation is to develop a novel computational model of a VIC capable of describing its mechanical response under different external stimuli and activation states. To this end, a solid mixture model framework of a VIC is developed, where the VIC cytoplasm is treated as a solid mixture of two phases: an isotropic cytoskeleton and oriented stress fibers. The stress fiber model is then incrementally extended to capture increasingly complex mechanical responses of VICs. Finite element simulations are performed with the aid of experimental data to investigate how the internal mechanics of VICs, such as the solid cytoskeletal network, contracting stress fibers, and the cell nucleus, affect the mechanical responses of VICs within a native tissue. The development of the computational model of a VIC and its numerical implementation are critical to studying heart valve disease at the cellular level because of the complexity of the mechanisms and the difficulty of directly analyzing the subcellular mechanics. The computational model in conjunction with experimental data provides insight into how the VICs respond within the native valve tissue, and how heart valve disease may initiate.
This dissertation is the first step towards developing prevention mechanisms and cures for heart valve disease at the cellular and subcellular levels.

Item Structural proteomics towards studying protein complexes in cellular context (2022-08-18)
McCafferty, Caitlyn L.; Marcotte, Edward M.; Taylor, David W., Jr.; Wallingford, John B; Wilke, Claus; Durand, Dannie
Recent advances in cryo-electron microscopy (cryo-EM) allow for rapid and direct visualization of high-resolution three-dimensional (3D) structures. Unfortunately, current methods are often limited to identification of and structure determination for highly purified protein samples, preventing characterization of protein complex structures in their native form. Being able to determine a protein structure in the native biological “habitat” by imaging, for example, an unpurified cell lysate, is essential to understanding its role in the mechanisms that drive biological processes. Methods such as shotgun-electron microscopy (shotgun-EM) and cryo-electron tomography (cryo-ET) aim to study structural biology in the biological context of the cell, which allows for the definition of native interaction partners and identification of spatial information. However, a major and frequent tradeoff of imaging biological assemblies in a near-native context is a decrease in structure resolution. Integrating cryo-EM with complementary techniques such as mass spectrometry can allow for characterization of protein architectures without purifying for a specific target. Furthermore, protein-protein interactions identified by mass spectrometry or other computational methods, in conjunction with cutting-edge methods for 3D structure determination and prediction, can actually be used to investigate the architecture of multiple distinct protein complexes from a single mixture. Here, I detail computational methods that I have developed for determining protein interaction interfaces based on surface complementarity.
This serves as a useful tool for determining how protein subunits may fit together within an EM map. I then describe an approach for using intermolecular evolutionary couplings between amino acids at interfaces to assemble proteins into intermediate- to low-resolution EM maps. Because both of these methods are dependent on reliable protein structures, I then used in situ cross-linking mass spectrometry (XL/MS) to evaluate and improve the quality of AlphaFold predicted protein structures. Finally, I present the use of these methods to study native ciliary protein architectures, such as the intraflagellar transport A (IFT-A) complex. Ciliopathies are a class of diseases that arise from dysfunction in cilia. I show that using integrative methods to study the IFT-A architecture allows for the molecular interpretation of disease-associated alleles. The development and integration of these methods provides a toolbox for studying ciliary assemblies in their native environment and interpreting these lower resolution assemblies with the goal of providing a blueprint for the molecular basis of disease.

Item A systems approach to computational protein identification (2010-05)
Ramakrishnan, Smriti Rajan; Miranker, Daniel P.; Dhillon, Inderjit S.; Marcotte, Edward M.; Mooney, Raymond J.; Press, William H.
Proteomics is the science of understanding the dynamic protein content of an organism's cells (its proteome), which is one of the largest current challenges in biology. Computational proteomics is an active research area that involves in-silico methods for the analysis of high-throughput protein identification data. Current methods are based on a technology called tandem mass spectrometry (MS/MS) and suffer from low coverage and accuracy, reliably identifying only 20-40% of the proteome. This dissertation addresses recall, precision, speed and scalability of computational proteomics experiments.
This research goes beyond the traditional paradigm of analyzing MS/MS experiments in isolation, instead learning priors of protein presence from the joint analysis of various systems biology data sources. This integrative 'systems' approach to protein identification is very effective, as demonstrated by two new methods. The first, MSNet, introduces a social model for protein identification and leverages functional dependencies from genome-scale, probabilistic, gene functional networks. The second, MSPresso, learns a gene expression prior from a joint analysis of mRNA and proteomics experiments on similar samples. These two sources of prior information result in more accurate estimates of protein presence, and increase protein recall by as much as 30% in complex samples, while also increasing precision. A comprehensive suite of benchmarking datasets is introduced for evaluation in yeast. Methods to assess statistical significance in the absence of ground truth are also introduced and employed whenever applicable. This dissertation also describes a database indexing solution to improve speed and scalability of protein identification experiments. The method, MSFound, customizes a metric-space database index and its associated approximate k-nearest-neighbor search algorithm with a semi-metric distance designed to match noisy spectra. MSFound achieves an order of magnitude speedup over traditional spectra database searches while maintaining scalability.