Pathogen Proteotyping: A Game Changer for Clinical Microbiology

How mass spectrometry can transform the diagnosis and treatment of infectious diseases

Raeesa Gupte, PhD

Published: Jul 06, 2020 | Updated: Oct 30, 2020 | 5 min read

The COVID-19 pandemic has pushed infectious diseases into the limelight, but they have always been a global health concern. The World Health Organization (WHO) ranked lower respiratory tract infections, diarrheal diseases, and tuberculosis among the top ten causes of death worldwide in 2016.

The bacteria, viruses, fungi, and protozoa that cause infectious diseases often produce overlapping clinical symptoms that are not specific to an individual microbe. Consequently, traditional methods of diagnosis rely on culture and isolation of microbes from clinical samples, followed by their phenotypic and genotypic characterization. These methods are time-consuming, requiring days to weeks before confirmatory results are obtained. Since bacterial infections may escalate into life-threatening conditions such as sepsis, physicians often prescribe broad-spectrum antibiotics before confirming the identity of the infecting microorganism. Such overuse of broad-spectrum antibiotics has contributed to the emergence of antibiotic resistance. Therefore, the ability to identify the disease-causing pathogen accurately and rapidly is crucial for starting targeted treatment.

Proteotyping refers to the use of protein markers for characterization, classification, and identification of microorganisms. Clinical laboratories are tapping into the high speed, low cost, and simplicity of mass spectrometry (MS) for pathogen proteotyping. Wide availability of microbial protein sequence databases and advances in bioinformatics and MS instrumentation are making proteotyping of bacteria and yeasts a routine procedure in clinical microbiology.1 

Mass spectrometry-based techniques for pathogen proteotyping

Initially, extensive fragmentation caused by harsh ionization techniques made MS suitable for identifying only low-molecular-weight compounds. Soft ionization techniques such as matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI) are now used to identify macromolecules, including proteins.2

In MALDI-time-of-flight (MALDI-TOF) MS, microorganisms cultured from clinical samples are applied to a target plate and embedded within a matrix. When a laser is focused onto the sample, the matrix absorbs most of the energy and prevents excessive fragmentation of microbial proteins. In ESI-MS, high voltage is applied to the sample as it is eluted from a capillary tip. This produces a fine mist of charged droplets. Heat is used to vaporize the solvent and the charged analytes are directed to the mass analyzer. The mass spectra obtained by MALDI-TOF or ESI-MS are then compared against a known reference database for sample identification. 
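The database comparison step can be illustrated with a minimal sketch. It assumes each spectrum has already been reduced to a list of peak positions (m/z values) and scores a sample by the fraction of peaks it shares with each reference entry within a mass tolerance. The function names, the 500 ppm tolerance, the 0.6 cutoff, and the peak lists are all hypothetical and chosen only for illustration; commercial systems use their own scoring algorithms, which are not reproduced here.

```python
from bisect import bisect_left

def count_matched_peaks(sample, reference, tol_ppm=500.0):
    """Count sample peaks that lie within tol_ppm of some reference peak."""
    ref = sorted(reference)
    matched = 0
    for mz in sample:
        tol = mz * tol_ppm / 1e6           # absolute tolerance at this m/z
        i = bisect_left(ref, mz - tol)     # first reference peak >= mz - tol
        if i < len(ref) and ref[i] <= mz + tol:
            matched += 1
    return matched

def similarity(sample, reference, tol_ppm=500.0):
    """Shared-peak fraction relative to the longer peak list (0.0 to 1.0)."""
    if not sample or not reference:
        return 0.0
    shared = count_matched_peaks(sample, reference, tol_ppm)
    return shared / max(len(sample), len(reference))

def identify(sample, database, min_score=0.6, tol_ppm=500.0):
    """Return (best_name, score); best_name is None if no entry passes the cutoff."""
    best_name, best_score = None, 0.0
    for name, ref_peaks in database.items():
        score = similarity(sample, ref_peaks, tol_ppm)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        return None, best_score
    return best_name, best_score

# Invented peak lists (m/z), not taken from any real organism or database.
REFERENCE_DB = {
    "organism_A": [3125.0, 4365.0, 5096.0, 6255.0, 7274.0, 9064.0],
    "organism_B": [3320.0, 4448.0, 5525.0, 6890.0, 7570.0, 9625.0],
}
sample_peaks = [3125.6, 4364.2, 5096.9, 6255.3, 7900.0, 9064.8]

name, score = identify(sample_peaks, REFERENCE_DB)
print(name, round(score, 2))
```

Returning no name when the best score falls below the cutoff mirrors the practical point that an organism absent from the reference database cannot be identified, however good the spectrum is.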

Clinical applications of pathogen proteotyping

Identification of pathogen species and strains

Both commercial and research-use-only (RUO) MS systems are available for identification of numerous infectious agents. Two FDA-cleared commercial systems are available for use in clinical laboratories. The Bruker Biotyper CA system is capable of identifying gram-positive and gram-negative bacteria, anaerobic bacteria, Enterobacteriaceae, and yeasts.1 The RUO database for this system can also identify fungi and mycobacteria. The VITEK MS system developed by bioMérieux is FDA-cleared for detection of gram-positive and gram-negative bacteria, yeast, mycobacteria, nocardia, and mold.

Both commercial systems use MALDI-TOF to identify organisms at the genus and species level. Ribosomal proteins and other abundant housekeeping or structural proteins are used for microbial identification and characterization. Because only these abundant proteins contribute peaks, the resulting spectra contain too few features to capture the subtle differences between subspecies. The higher resolution of tandem MS (MS/MS) techniques is better suited for differentiating strains of the same microbial species. For example, LC-MS/MS was used to distinguish between probiotic and pathogenic strains of Escherichia coli based on unique differences in metabolite biosynthesis.3 Compared to MALDI-TOF, LC-MS/MS also had greater success in discriminating between strains of the nosocomial pathogen Acinetobacter baumannii.4

The ability to identify strains with enhanced pathogenicity has important implications for studying the epidemiology of infectious diseases. In addition, detecting differences in virulence and antibiotic resistance between strains of the same pathogenic species can help inform treatment decisions. 

Research into host-pathogen interactions 

Infection occurs in three stages: entry of a pathogen into a host cell, replication of the pathogen, and spread of the pathogen to other cells or hosts. Most viruses enclose their genetic material in a protein shell called a capsid, and many are additionally wrapped in a lipid envelope studded with viral proteins. These structural proteins determine the structure, stability, and infectivity of mature virions. As they infect and replicate, pathogens also interact with and acquire several proteins from their host cells.

The structural proteins of a virion may be identified by top-down or bottom-up proteomic approaches. The top-down approach uses MALDI-TOF or ESI to identify intact proteins. The bottom-up approach proteolytically cleaves proteins into peptides, separates them, and then uses tandem MS (MS/MS) to generate mass spectra. LC-MS/MS and gel electrophoresis followed by ESI-MS/MS have previously been used to identify the spike, membrane, nucleocapsid, and envelope proteins of SARS-CoV.5 LC-MS/MS was also used to identify viral and host proteins incorporated into HIV-1 virions.6
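The proteolytic cleavage step of the bottom-up approach can be sketched in code. The example below assumes trypsin as the protease (a common choice, though the article does not name one) and applies the usual rule that trypsin cuts after lysine (K) or arginine (R) unless the next residue is proline (P). It then computes the monoisotopic mass and m/z of each peptide from standard residue masses rounded to five decimals. The function names and the protein sequence are hypothetical, and missed cleavages and modifications are ignored.

```python
import re

# Standard monoisotopic residue masses in daltons, rounded to five decimals.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # added once per peptide for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier in positive-ion mode

def tryptic_digest(sequence):
    """Split after K or R, except when the following residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]   # drop the empty piece after a C-terminal K/R

def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER

def mass_to_charge(mass, charge=1):
    """m/z of a peptide carrying `charge` extra protons."""
    return (mass + charge * PROTON) / charge

# Invented sequence, for illustration only.
protein = "MKTAYIAKQRPGLSVEK"
peptides = tryptic_digest(protein)    # ['MK', 'TAYIAK', 'QRPGLSVEK']

for pep in peptides:
    m = peptide_mass(pep)
    print(pep, round(m, 4), round(mass_to_charge(m, 2), 4))
```

In a real workflow, lists of theoretical peptide masses like these are generated from protein sequence databases and compared with the measured spectra, which is why the availability of microbial sequence databases matters for proteotyping.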

Post-translational modifications such as glycosylation play an important role in protein folding, viral tropism, and immune evasion. Recently, glycosylation sites on the spike protein of SARS-CoV-2 were characterized using LC-MS.7 An understanding of host-pathogen interactions provides insights into mechanisms of infection, replication, and transmission that can be used to develop drugs and vaccines. 

Limitations and future directions

A major limitation of pathogen proteotyping using MALDI-TOF MS is that identification of pathogenic organisms is possible only if the reference database contains peptide mass fingerprints of the specific genus, species, or subspecies. In addition, the databases of commercial systems are proprietary and cannot be updated by the user. Therefore, laboratories need to construct their own reference database on RUO systems for endemic pathogens and query these along with commercial databases. Since MALDI-TOF only measures a small number of abundant proteins, it produces low-resolution spectra that cannot distinguish between closely related microbial species. Lastly, direct testing of clinical samples is not feasible using MALDI-TOF because of its low sensitivity. 

Bottom-up approaches involving proteolytic digestion provide higher resolution and sequence coverage than top-down approaches. They also provide rapid, high-throughput analysis of subtle differences between strains. However, the harsher ionization techniques involved may hamper detection of post-translational modifications involved in host-pathogen interactions.

Coupling multiplex PCR with ESI-MS combines the power of genomics and proteomics. Although not yet routinely used in infectious disease diagnostics, this combined approach has the potential to screen large panels of suspected pathogens from microbial isolates or directly from clinical samples. The ability to analyze multiple organisms and multiple strains quickly and with high accuracy is valuable in clinical and public health settings.


1. Heaton, Phillip, and Robin Patel. "Mass spectrometry applications in infectious disease and pathogens identification." Principles and Applications of Clinical Mass Spectrometry. Elsevier, 2018. 93-114. 

2. Karlsson, Roger, et al. "Proteotyping: Tandem mass spectrometry shotgun proteomic characterization and typing of pathogenic microorganisms." MALDI-TOF and Tandem MS for Clinical Microbiology (2017): 419-450. 

3. Smith, David, et al. "Substantial extracellular metabolic differences found between phylogenetically closely related probiotic and pathogenic strains of Escherichia coli." Frontiers in Microbiology 10 (2019): 252. 

4. Wang, Honghui, et al. "A novel peptidomic approach to strain typing of clinical Acinetobacter baumannii isolates using mass spectrometry." Clinical Chemistry 62.6 (2016): 866-875. 

5. Zeng, Rong, et al. "Proteomic analysis of SARS associated coronavirus using two-dimensional liquid chromatography mass spectrometry and one-dimensional sodium dodecyl sulfate-polyacrylamide gel electrophoresis followed by mass spectrometric analysis." Journal of Proteome Research 3.3 (2004): 549-555.

6. Chertova, Elena, et al. "Proteomic and biochemical analysis of purified human immunodeficiency virus type 1 produced from infected monocyte-derived macrophages." Journal of Virology 80.18 (2006): 9039-9052. 

7. Watanabe, Yasunori, et al. "Site-specific glycan analysis of the SARS-CoV-2 spike." Science (2020).

Raeesa Gupte, PhD

Raeesa Gupte, PhD, is a freelance medical and science writer and editor specializing in evidence-based medicine, neurological disorders, and translational diagnostics. She holds a PhD in pharmacology from The University of Iowa.