DNA sequencing has been successfully used to link variations in an individual’s genome with a wide range of medical conditions. In many cases, this genetic information has transformed clinical practice. For example, in oncology, sequencing of genes linked to cancer risk, such as BRCA1/2, has helped focus disease surveillance and early intervention where they are most impactful. More advanced sequencing of tumor tissue is now approaching “standard of care” status for numerous cancers, and is recommended in many treatment guidelines, including NCCN Guidelines.
Meanwhile, other potential applications of DNA sequencing remain in development. In particular, the rapidly evolving field of immune sequencing is poised to have a wide-ranging impact on clinical practice, including the diagnosis and treatment of autoimmune disease, infectious disease, and cancer. Innovative methods, such as high-throughput sequencing and protein structure prediction, are bringing immune sequencing techniques ever closer to the clinic.
Immune function determines human health
The clinical applications of immune sequencing are especially wide-reaching because human health is so intertwined with immune function. Largely, health relies on the immune system’s ability to accurately recognize and eliminate pathogens and dysregulated cells. So, it follows that sequence-level insight into the immune response has clinical applications for infectious disease and cancer.
Additionally, the immune system must regulate itself to avoid autoimmune-mediated damage to normal tissues. As a result, immune sequencing also has the potential to improve the diagnosis and treatment of autoimmune disease. Already, identification of the specific sequences involved in an immune response has been successfully used in research settings to elucidate the role of immune dysregulation in conditions such as type 1 diabetes, rheumatoid arthritis, multiple sclerosis, Graves’ disease, Crohn’s disease, and many others.
The immune system relies on immense genetic diversity
While research has elucidated the role of immune dysregulation in a growing number of conditions, immune sequencing is not commonly used in clinical practice. The immune system’s inherent diversity (more than a trillion unique genetic sequences are possible) prevents the direct comparison to a reference sequence that is typical of many DNA- and RNA-sequencing applications.
This diversity appears in the immune repertoire of receptors expressed on T and B cells (lymphocytes), which coordinate the adaptive immune response. Each member of the vast repertoire of lymphocyte receptors is encoded by a unique combination of gene segments, including variable (V), diversity (D), and joining (J) gene segments, which results in at least a trillion (10^12) unique receptors. The diversity of B-cell receptors is further expanded by somatic hypermutation following antigen exposure.
As a result of this genetic diversity, lymphocyte receptors are able to recognize a great variety of antigens specific to bacteria, viruses, parasites, and cancerous cells, allowing T and B cells to effectively target the wide range of pathogens an individual may encounter over their lifetime.
Sequencing the genetic diversity of T-cell and B-cell receptors
Advancements in high-throughput sequencing methods have now made it possible to sequence the genetic diversity of the immune repertoire and identify the specific sequences involved in an immune response. Sequencers capable of producing reads long enough to span entire immune receptor regions are now available from Illumina, PacBio, and Oxford Nanopore. A wide range of laboratory approaches has been developed for T-cell and B-cell receptor library preparation for these sequencing platforms.
When choosing a laboratory method to selectively profile the immune repertoire, several considerations should be taken into account: sample input type (DNA or RNA), target nucleotide length (most variable region only or full-length receptor), and the level of diversity relevant to the clinical application (assessing broad diversity requires more reads, whereas fewer reads may be more efficient for identifying only the most abundant sequences). For example, sequencing to identify the antibodies induced by vaccination may require fewer reads than an application searching for rare sequences in a more complex repertoire. Each sequencing method has its own advantages and limitations.
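The relationship between read count and the ability to find rare sequences can be illustrated with a back-of-the-envelope calculation. The sketch below rests on a simplifying assumption that is not drawn from any specific protocol: each read is treated as an independent draw from the repertoire, so a sequence present at frequency f is seen at least once in N reads with probability 1 - (1 - f)^N. The function names are illustrative only.

```python
import math


def detection_probability(frequency: float, n_reads: int) -> float:
    """Probability of seeing a sequence at least once in n_reads.

    Assumes every read is an independent draw and the sequence makes up
    `frequency` of the repertoire (an idealized sampling model).
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if n_reads < 0:
        raise ValueError("n_reads must be non-negative")
    # 1 - (1 - f)^N, computed with log1p/expm1 to stay accurate for tiny f.
    return -math.expm1(n_reads * math.log1p(-frequency))


def reads_needed(frequency: float, target_probability: float) -> int:
    """Smallest read count that reaches the target detection probability."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be strictly between 0 and 1")
    # Solve 1 - (1 - f)^N >= p for N and round up to a whole read.
    return math.ceil(math.log1p(-target_probability) / math.log1p(-frequency))


# Compare an abundant sequence (1 in 100) with a rare one (1 in a million).
for freq in (1e-2, 1e-6):
    print(freq, reads_needed(freq, target_probability=0.95))
```

Under this idealized model, the required depth scales roughly with the inverse of the frequency being sought, which is why a search for rare sequences calls for far more reads than a survey of the dominant ones. Real experiments also face amplification bias and sequencing errors, so these figures are best read as lower bounds.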
Since the sequence of each T-cell or B-cell receptor is unique, analysis and interpretation of the sequencing results require specialized techniques that differ from traditional sequencing applications. Reads must be assembled, grouped, and classified according to which V, D, and J regions have been used. Single-base modifications can dramatically change binding energies, making base accuracy important. For RNA methods, an accurate assessment of relative abundance requires reliable identification of duplicate molecules introduced during library prep. After processing, full-length sequences can be further analyzed.
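The grouping and duplicate-handling steps described above can be sketched in a few lines. This example assumes that reads have already been annotated with V, D, and J gene calls by an upstream tool, and that each original molecule was tagged with a unique molecular identifier (UMI) during library preparation. UMIs are one common way to identify duplicates, not the only one, and they are an assumption of this sketch rather than something the text prescribes. The record fields, gene labels, and function names are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class AnnotatedRead(NamedTuple):
    """One read after V(D)J annotation (hypothetical record layout)."""
    umi: str        # molecular barcode attached before amplification
    v_gene: str
    d_gene: str
    j_gene: str
    sequence: str   # receptor sequence recovered from the read


def count_molecules(reads):
    """Group reads by receptor and count distinct UMIs in each group.

    Reads that share both a receptor and a UMI are treated as copies of
    one original molecule. UMIs are matched exactly; no correction of
    sequencing errors within the UMI is attempted in this sketch.
    """
    umis_by_receptor = defaultdict(set)
    for read in reads:
        key = (read.v_gene, read.d_gene, read.j_gene, read.sequence)
        umis_by_receptor[key].add(read.umi)
    return {key: len(umis) for key, umis in umis_by_receptor.items()}


def relative_abundance(molecule_counts):
    """Convert per-receptor molecule counts into fractions of the total."""
    total = sum(molecule_counts.values())
    return {key: n / total for key, n in molecule_counts.items()}


# Placeholder data: the first two reads are duplicates of one molecule.
example_reads = [
    AnnotatedRead("AAAC", "V1", "D1", "J1", "SEQ_A"),
    AnnotatedRead("AAAC", "V1", "D1", "J1", "SEQ_A"),
    AnnotatedRead("GGTT", "V1", "D1", "J1", "SEQ_A"),
    AnnotatedRead("CCAT", "V2", "D1", "J2", "SEQ_B"),
]
for receptor, share in relative_abundance(count_molecules(example_reads)).items():
    print(receptor, round(share, 3))
```

Counting distinct UMIs rather than raw reads keeps a heavily amplified molecule from inflating the apparent abundance of its receptor. A production pipeline would typically also tolerate small errors in the UMI and in the receptor sequence itself, which this sketch deliberately leaves out.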
Recent advances in ab initio protein structure prediction from sequence (e.g., AlphaFold Protein Structure Database) have fueled the prediction of antibody structures (e.g., IgFold developed by researchers at Johns Hopkins University), expanding the possibility of accurately linking immune repertoire sequences with their targets.
Realizing the potential of immune repertoire sequencing
Immune repertoire sequencing is an evolving field with diverse and innovative wet lab approaches, sequencing methods, and tools for data interpretation. Though social, regulatory, and other challenges must be overcome before these techniques can be applied clinically, their vast potential to improve biomarker identification and the diagnosis and treatment of cancer, infectious disease, and autoimmune disease makes immune repertoire sequencing impossible to ignore.