Methods for proteomic characterization of antibody repertoires and de novo peptide sequencing
Driven by the increased performance and availability of protein mass spectrometry and next-generation sequencing technologies, research in proteomics and systems biology has expanded far beyond the study of model organisms. This heralds a deeper understanding of biology, the world, and human health. However, it also brings significant new challenges to the interpretation of sequencing and mass spectrometry data, as current software tools are ill-suited to many modern studies. The first half of this dissertation explores some of these challenges and solutions in the context of a particularly demanding domain: serological antibody proteomics. Our team has developed a combined sequencing and proteomics approach for profiling the human serum antibody repertoire. This opens an unprecedented view into the nature of the adaptive immune system and provides insight into antibody repertoire dynamics in both health and disease. The platform also provides an effective means to evaluate vaccine efficacy and identify potential antibody therapeutics. Chapter 1 reviews recent advances in, and results from, such molecular-level characterization of the serum antibody repertoire. As detailed in Chapter 2, challenges specific to antibody repertoire proteomics preclude the use of standard analysis methods and motivated our development of novel tools and approaches for interpreting serum repertoire proteomic data. In Chapters 3 and 4, I shift focus to present an experimental and computational workflow for accurate, full-length de novo peptide sequencing. We applied 351 nm ultraviolet photodissociation (UVPD) to chromophore-tagged peptides and developed software for sequencing the resultant UVPD mass spectra. Improvements described in Chapter 4 enable the software to automatically learn from and interpret new types and combinations of spectra from the same precursor peptide.
We demonstrate the effectiveness of this machine learning framework on CID/UVPD spectral pairs and obtain results from low-resolution spectra that are comparable to the current state of the art. Continued development of these de novo interpretation and sequencing methods, in part or in whole, may sidestep many of the remaining challenges facing repertoire proteomics, and successful application of these efforts promises further advances in the characterization and understanding of antibody repertoires.