My laboratory at Children’s Mercy Kansas City focuses on identifying the genetic determinants of diseases, primarily in children with rare diseases, which are difficult to diagnose or detect using routine methods.
Some of the patients have neurodevelopmental disorders, some have rare susceptibility to infection, and some have congenital heart disease, renal disease, hearing loss, or blindness. Many of the diseases we encounter in our research program are so rare that there are only a handful of cases in the US or a few dozen globally.
When a case gets referred to our lab, it’s because the common tools for diagnosing rare disease have failed to yield a result: clinical examination, non-molecular diagnostic tests, and even molecular tests such as microarrays and short-read exome and genome sequencing. In our lab, we employ a gamut of genomic and multiomic tools to determine the pathogenicity of a mutation or to identify a genetic cause of disease missed by prior testing. These tools include RNA sequencing, short- and long-read genome sequencing, single-cell sequencing, and chromatin sequencing, among others.
Blind spots of genome sequencing methods
Short-read sequencing has greatly increased diagnostic yield over microarrays. However, this method has many blind spots when it comes to identifying the gene responsible for a disease. For example, repeat expansions, which are responsible for diseases like myotonic dystrophy, can be very difficult to identify with short-read sequencing. Similarly, about three percent of disease-causing genes, such as those responsible for spinal muscular atrophy, have closely related DNA sequences elsewhere in the genome, so short reads cannot be mapped to them unambiguously. And structural variants can be hard to identify because short reads capture too little of the surrounding genomic context.
Long-read sequencing can help overcome these blind spots by easily showing long repeat expansions and accurately mapping genes within proper genomic context.
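To make the repeat-expansion blind spot concrete, here is a minimal toy sketch in Python. It is not how any real pipeline works. The locus, the flank sequence, the read lengths (150 and 5,000 bases), the 10-base anchor, and the function names are all illustrative assumptions. The only thing taken from biology is the CTG motif, the unit that expands in myotonic dystrophy type 1. The idea is that a read can only count the repeat if it reaches unique sequence on both sides of it.

```python
import re

def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)

def observed_copies(locus, repeat_start, repeat_end, read_len, unit, anchor=10):
    """Copy number seen by a read anchored in both flanks.

    Returns None when a read of this length cannot cover the whole repeat
    plus `anchor` bases of unique sequence on each side.
    """
    needed = (repeat_end - repeat_start) + 2 * anchor
    if read_len < needed:
        return None
    start = repeat_start - anchor
    return longest_tandem_run(locus[start:start + read_len], unit)

# Hypothetical locus: 100 CTG copies (300 bases) between two toy flanks.
UNIT = "CTG"
TRUE_COPIES = 100
flank = "ACGT" * 1500  # 6,000 bases, chosen so it never contains "CTG"
locus = flank + UNIT * TRUE_COPIES + flank
repeat_start = len(flank)
repeat_end = repeat_start + len(UNIT) * TRUE_COPIES

# Assumed read lengths for illustration only.
short_read = observed_copies(locus, repeat_start, repeat_end, 150, UNIT)
long_read = observed_copies(locus, repeat_start, repeat_end, 5000, UNIT)

# A short read lying entirely inside the repeat only yields a lower bound.
inside_repeat = longest_tandem_run(locus[repeat_start:repeat_start + 150], UNIT)

print("short read, anchored in both flanks:", short_read)
print("short read, inside the repeat:", inside_repeat)
print("long read, anchored in both flanks:", long_read)
```

In this toy setup the short read either fails to span the repeat or sees only the copies that fit inside it, while the long read recovers the full count. Real repeat callers are considerably more sophisticated, so treat this only as an illustration of the underlying constraint.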
But DNA sequencing alone cannot uncover the basis of all rare diseases. For example, DNA methylation data is the only way to identify the cause of imprinting disorders like Beckwith-Wiedemann or Angelman syndrome. Most clinical laboratories don’t look at methylation data and are therefore unlikely to detect the abnormal methylation patterns behind these disorders.
The power of worldwide data sharing and collaboration
Rare diseases may be genetic in origin but involve multiple gene mutations and/or interactions. Identifying multifactorial gene combinations can be extremely difficult because the rarity of these conditions is compounded by a lack of bioinformatic tools.
However, in a study of a cohort of 500 children suspected of having rare diseases, multiomic techniques helped identify the genetic cause of a condition in roughly 5 percent of cases that went undiagnosed following short-read sequencing.
Improving the diagnosis of rare diseases warrants extensive worldwide data sharing and collaboration. With better reference data, we would be able to robustly and accurately screen and detect genetic causes of rare diseases. We’re hopeful that as the clinical community continues to leverage more long-read and multiomic approaches, the success rate of treating rare diseases will improve.