MuBuCo: Mutation Burden Composition
Abstract
Gene mutations vary by type: they may affect a single site, span hundreds of base pairs, or be over- or under-represented in cancer cells. Collectively, these mutation types include single nucleotide variants (SNVs), structural variants (SVs), and copy number variants (CNVs). Tumor mutation burden is a measure widely used throughout cancer research, but it is often limited to a single dimension, using SNVs only. We derive a sample mutation burden for each mutation type and combine them to define the relative contribution of each type, forming a mutation burden composition (MuBuCo). We applied MuBuCo to multiple myeloma (MM), a well-recognized, genomically heterogeneous blood cancer. Using 70 MM cell lines, we first computationally assessed more than 15 bioinformatics tools for detecting each type of variant and selected the best-performing ones (judged against known features) to calibrate and characterize these cell lines. This required more than 10,500 node hours on Texas Advanced Computing Center clusters. We also developed a Snakemake pipeline incorporating preprocessing and SNV calling. Each cell line’s variant calls were used to calculate its burden for each mutation type. We further defined expressed mutations as variants found in expressed genes, for use in predicting neoantigens. We implemented the results in a queryable application that enables cancer researchers to select MM cell lines of interest and visualize their MuBuCo relative to other cell lines. With this information, we hope to improve our understanding of the molecular background against which these cell lines are used for testing new treatments. We further provide an in silico look at changes in a cell line’s MuBuCo following user-specified removal of one or more genes, mimicking a ‘knock out’ experiment. Our application offers a novel mutation analysis whose results were not readily attainable until now.
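The idea of a burden composition, and of the in silico knock out, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how each burden is normalized, so this sketch assumes a burden is a raw variant count per type and that the composition is each type's share of the total. The function names, the record layout, and the toy gene labels are all hypothetical.

```python
from collections import Counter

# The three mutation types named in the abstract.
MUTATION_TYPES = ("SNV", "SV", "CNV")

# Toy variant calls for one cell line (hypothetical data, for illustration only).
toy_variants = [
    {"gene": "KRAS", "type": "SNV"},
    {"gene": "TP53", "type": "SNV"},
    {"gene": "TP53", "type": "CNV"},
    {"gene": "MYC", "type": "SV"},
]


def burden_by_type(variants, exclude_genes=()):
    """Count variants per mutation type.

    Assumption: burden is a raw count. Variants in `exclude_genes` are
    dropped first, which mimics removing those genes in silico.
    """
    excluded = set(exclude_genes)
    counts = Counter(v["type"] for v in variants if v["gene"] not in excluded)
    return {t: counts.get(t, 0) for t in MUTATION_TYPES}


def composition(burden):
    """Convert per-type burdens into fractions of the total burden."""
    total = sum(burden.values())
    if total == 0:
        # No variants left: report an all-zero composition.
        return {t: 0.0 for t in burden}
    return {t: n / total for t, n in burden.items()}


if __name__ == "__main__":
    print(composition(burden_by_type(toy_variants)))
    print(composition(burden_by_type(toy_variants, exclude_genes=["TP53"])))
```

Comparing the two printed compositions shows how removing a gene shifts the relative contribution of each mutation type. A real analysis would likely normalize each burden (for example per megabase) before combining them.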