Development of computational methods for immune repertoire analysis: from sequence to specificity
Abstract
The immune system plays a key role in maintaining human health. Accurately characterizing immune receptors with immune repertoire sequencing (IR-Seq) is essential for understanding the adaptive immune system. Towards this goal, we developed a bioinformatics tool, Molecular Identifier Clustering-based IR-Seq (MIDCIRS), to quantitatively measure the immune repertoire. We demonstrated MIDCIRS' accuracy, high coverage, and wide dynamic range, which allow us to analyze various types of immune repertoires. The immune repertoire is continuously shaped by the antigens it encounters; thus, its composition reflects an individual's disease history. We applied MIDCIRS to measure the antibody repertoire of malaria-experienced individuals and found an unexpected mutational capability in infants' adaptive immune system. We also used MIDCIRS to measure follicular helper T (Tfh) cells obtained directly from the lymph nodes of untreated HIV patients and found (1) evidence for intact antigen-driven clonal expansion of Tfh cells and (2) selective utilization of specific complementarity-determining region 3 (CDR3) motifs during chronic HIV infection. Both studies demonstrated MIDCIRS' functionality and versatility for studying antigen-driven immune responses. Bridging the gap between immune receptor sequences and their biological function (i.e., antigen specificity) would make it possible to directly measure immune repertoire changes in response to pathogen infection. Using experimentally validated CD8+ TCR sequences with known antigen specificity, we developed a computational tool, Linear programming based Motif Pick and Enrichment analysis for Tcrs (LiMPETs), to find significant motifs within the TCR CDR3 region that determine antigen specificity. We demonstrated LiMPETs' advantages by comparing it with existing tools on both public and in-house data.