The 69th American Society for Mass Spectrometry (ASMS) Conference on Mass Spectrometry and Allied Topics took place last fall in beautiful Philadelphia, PA, from October 31 to November 4. Though the pandemic limited the number of scientists who could attend in person, the smaller crowds fostered interdisciplinary discussions during talks and poster presentations covering new instruments and bioinformatics solutions.
Advances in mass spectrometry aim to streamline biologics analysis
SCIEX presented an interesting new way to use electron-activated dissociation (EAD) fragmentation. The company reported that its EAD approach can detect and quantify up to 40 percent more proteins than methods using collision-induced dissociation. Both dissociation methods induce fragmentation along the peptide backbone to yield charged fragments whose mass-to-charge ratio (m/z) can be used for peptide identification.
EAD can generate specific ions, known as z-ions, that are particularly useful for distinguishing between isoleucine (I) and leucine (L) residues, which have the same mass. I/L elucidation is critical for MS analysis of antibodies because their antigen-binding regions are often peppered with I and L residues. Incorrect identification of these residues can result in altered antibody binding, and even altered immunogenicity, which would drastically impact the performance and safety of antibody therapeutics.
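To make the I/L problem concrete, here is a minimal Python sketch of how a peptide's m/z follows from its residue masses. The residue values are standard monoisotopic masses; the function name `peptide_mz` and the two example sequences are hypothetical and chosen only for illustration. This is not SCIEX's method, just the underlying arithmetic.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the peptide termini
PROTON = 1.007276   # mass of each charge-carrying proton


def peptide_mz(sequence, charge=1):
    """Return the m/z of an unmodified peptide at the given positive charge."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Two hypothetical peptides that differ only by leucine versus isoleucine.
with_leucine = peptide_mz("SLVK", charge=2)
with_isoleucine = peptide_mz("SIVK", charge=2)
print(with_leucine == with_isoleucine)  # True: mass alone cannot separate them
```

Because the two residues contribute exactly the same mass, any approach that relies on m/z of backbone fragments alone sees the two peptides as identical, which is why fragment types that reflect the side chain are needed.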
In addition to the sequence, post-translational modifications (PTMs), and specifically glycosylation, can also affect the function of antibody therapeutics. In their poster, SCIEX showed that EAD can be used for both sequence elucidation and glycosylation analysis. Being able to extract both types of information simultaneously is quite useful because of the complexity of glycan diversity and the nature of current bottom-up workflows.
Data produced from this advanced technique can immediately inform assessments of the downstream clinical efficacy and safety of antibody biologics. Thus, SCIEX positioned its new instrument, the ZenoTOF 7600, which employs EAD, as an indispensable tool for biologics quality control and certification on the road to regulatory approval. Such an approach to EAD is an exciting technological advance that may prove quite useful for bridging gaps in developing biologics for clinical use, as noted in SCIEX’s poster, where they analyzed biologics in collaboration with Janssen.
Spectra predictors offer insight into the world of immunopeptidomics
Aside from advances in instruments for biologics analysis, many studies presented bioinformatics solutions to address clinical proteomics challenges, particularly in the emerging field of immunopeptidomics.
Immunopeptidomics focuses on profiling major histocompatibility complex (MHC)-binding peptides for clinical cancer research. Most efforts have been centered around cancer because as cancer cells mutate, their MHCs present peptides relevant to these mutations on the cell surface; these surface antigens represent great targets for disease treatment and screening.
Beyond mutations, PTM alterations have also been linked to cancer development. Hence, immunopeptidomics has become increasingly relevant for biomarker and novel target discovery for developing cancer therapeutics and diagnostics.
But immunopeptidomics is plagued by unique challenges not typically faced by conventional clinical proteomics. Immunopeptides are generated through unique splicing mechanisms caused by high mutation rates. These splicing events result in peptides that are highly heterogeneous in both size and structure and that produce copious raw data in need of accurate assessment. Two of the main solutions proposed at ASMS were deep learning algorithms well suited to large-scale data analysis, and de novo sequencing, which extracts information directly from MS2 spectra to elucidate the primary structure without relying on databases, thus avoiding homology bias.
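The core idea behind de novo sequencing can be sketched in a few lines of Python: the spacing between consecutive fragment peaks in an MS2 spectrum corresponds to residue masses, so a sequence can be read off without a database. This is a deliberately idealized illustration, assuming a complete, noise-free series of singly charged b-ions. The helper names `b_ion_ladder` and `read_ladder`, the tolerance, and the example peptide are all hypothetical, and real tools handle missing peaks, noise, and multiple ion series.

```python
# Small subset of standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "I": 113.08406, "Q": 128.05858, "K": 128.09496,
}
PROTON = 1.007276


def b_ion_ladder(sequence):
    """Build idealized singly charged b-ion m/z values for a known peptide."""
    total = PROTON
    peaks = []
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        peaks.append(total)
    return peaks


def read_ladder(peaks, tol=0.01):
    """Infer residues from the mass gaps between consecutive peaks."""
    residues = []
    for lower, upper in zip(peaks, peaks[1:]):
        gap = upper - lower
        matches = sorted(aa for aa, mass in RESIDUE_MASS.items()
                         if abs(mass - gap) <= tol)
        residues.append("/".join(matches) if matches else "?")
    return residues


# The first residue is not recovered because there is no gap before the first peak.
print(read_ladder(b_ion_ladder("GASLK")))  # ['A', 'S', 'I/L', 'K']
```

The `I/L` entry in the output shows that mass gaps alone leave isobaric residues ambiguous, even when no database is involved.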
Matthias Wilhelm, PhD, and colleagues from the Technical University of Munich, Germany, offered a deep learning spectra predictor model, Prosit, for immunopeptide discovery. Unlike other deep learning approaches presented at ASMS, Prosit was trained not only on database data from human leukocyte antigen (HLA) immunopeptides (HLA being the human MHC gene cluster), but also on spectral data generated from their own custom-built synthetic library.
Altogether, Prosit was trained on data from more than 300,000 peptides and more than 98 percent of human HLA genes. After this extensive training, Prosit was benchmarked against other leading protein identification software on the market (e.g., Mascot, MaxQuant, and MSFragger). The researchers found that, compared to Prosit, all three of these tools failed to accurately identify 87 percent of the immunopeptides analyzed.
Aside from providing another means of validating the data, knowing the sequences in their synthetic library allowed the authors to note the presence of isobaric amino acids, which were inaccurately predicted by MaxQuant and MSFragger. As MaxQuant and MSFragger do not support de novo sequencing, the authors posited that misidentification could be due to homology bias, i.e., assigning the wrong amino acid(s) because the candidates share the same mass. This is a defining hurdle in database-driven analysis, which Wilhelm et al. tried to elegantly circumvent by generating a synthetic library, and which de novo sequencing was designed to tackle.
De novo sequencing also shows great promise for immunopeptidomics
Hoffman and colleagues presented data illustrating the power of de novo sequencing-based identification. Their approach, described in a bioRxiv preprint, involved de novo sequencing of purified HLA-I peptides in parallel with homology-based searches in the machine learning-based NetMHCpan-4.1 server. De novo sequencing alone identified half of the peptides from colorectal cancer samples, in addition to detecting extra peptides from database searches.
Hoffman and colleagues also reported that regardless of the sample type analyzed, normal or tumor, the most abundant peptide length was nine amino acids, accounting for 75 percent of peptides. In another study published in the Journal of Immunology, Jurtz et al. showed that depending on the sample, 8-mers could vary from 2 percent to more than 20 percent, and 9-mers from 30 percent to nearly 75 percent, which Hoffman and colleagues reported as characteristic of HLA-I peptides.
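Length distributions like these are simple to compute once peptides have been identified. The short Python sketch below tallies the share of each peptide length in a list of sequences. The function name `length_distribution` and the input peptides are hypothetical placeholders, not data from either study.

```python
from collections import Counter


def length_distribution(peptides):
    """Return the percentage of peptides at each length, keyed by length."""
    counts = Counter(len(peptide) for peptide in peptides)
    total = sum(counts.values())
    return {length: 100.0 * counts[length] / total for length in sorted(counts)}


# Hypothetical identified peptides: one 8-mer, two 9-mers, one 10-mer.
example_peptides = ["SLVKAGIL", "AALSKVGIL", "KVLAGSIAL", "GLSAKVIALV"]
print(length_distribution(example_peptides))  # {8: 25.0, 9: 50.0, 10: 25.0}
```

Comparing such distributions across samples or treatments is one way to spot the shifts in 8-mer and 9-mer abundance that these studies describe.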
Another report by Gfeller et al. published three years ago in the Journal of Immunology also found that 8- and 9-mers often contained the isobaric amino acids isoleucine and leucine in contrast to longer immunopeptides. This finding may explain why Hoffman and colleagues’ de novo sequencing was so effective at correctly identifying most of the peptides in the colorectal sample, and why Prosit detected homology-based bias in MaxQuant and other software.
Gfeller and colleagues noted that HLA-I molecules often bind both 8- and 9-mer immunopeptides, but that 8-mers are loaded at lower frequencies in the endoplasmic reticulum. Chong et al. also reported that treatment (e.g., interferon) may impact immunopeptide size distribution in the cell (i.e., the frequency at which these peptides may be produced and/or loaded). Altogether, this suggests that less abundant but still treatment- or diagnosis-relevant immunopeptides may not be identified by the MS instrument, an all-too-common blind spot.
Still, these methodologies show great potential for future identification of cancer biomarkers and therapeutics. Overall, this year’s ASMS conference was a great reminder of the resilience and innovative spirit of the scientific community, which, despite the current circumstances, continues to deliver promising breakthroughs for the analysis of biologics and for the discovery of novel biomarkers for clinical work.