Motif-informed analysis of phenotype heterogeneity in cancer

Cancer genomes harbor a wealth of DNA motifs, and their thorough analysis and integration offer a way to decipher the complex molecular interactions underlying cancer. This dissertation presents novel computational methodologies for robust DNA motif analysis and data integration, aiming to elucidate the implications of DNA motifs for cancer heterogeneity and clinical outcomes. Chapter 1 lays the groundwork by establishing the significance of DNA motifs in the genomic framework and reviewing current biomarkers in cancer. It highlights the opportunity that DNA motif analysis presents for a more nuanced understanding of genomic interactions, and it states the motivations and specific aims of the study for both DNA motif quantification and co-localization analysis. Chapter 2 introduces a foundational marker for quantifying the prevalence of repetitive DNA motifs, termed “Non-B DNA Burden”. A user-centric platform is also developed to facilitate efficient computation and visualization of this metric across various genomic scales. Together, they offer a novel perspective for analyzing DNA motif heterogeneity. Chapter 3 shifts the focus to an integrated marker approach. By combining the prevalence of DNA motifs with the frequency of co-localized mutations, two novel markers are proposed: mlTNB (mutation-localized total non-B burden) and nbTMB (non-B informed tumor mutation burden). Their potential for predicting cancer prognosis and treatment response is explored. Chapter 4 broadens the analytical foundation by defining MoCoLo (Motif Co-Localization), a robust statistical framework for testing multi-modal DNA motif co-localization. This framework makes it possible to explore the complex interplay of genomic features and provides a methodical approach to investigating their co-localization in a multi-modal data integration context. Case studies showcase the utility of MoCoLo in examining the co-localization of genomic features, supporting the understanding of genomic interactions that are pivotal to cancer biology. Chapter 5 synthesizes the findings of the preceding chapters and outlines the contributions of the developed methodologies to cancer genomics and bioinformatics. It discusses the potential impact of DNA motif analysis and data integration on understanding phenotype heterogeneity in cancer and identifies avenues for future research. Overall, this work contributes to the bioinformatics community a set of tools and analyses focused on DNA motif analysis and data integration. It aims to support a deeper understanding of the cancer genome and thereby to inform potential diagnostic and therapeutic strategies.

