Obstacles to Developing Tailored Normal Test Ranges
Large data sets hold promise, but major challenges remain
Common clinical laboratory tests are procedures in which samples of blood, urine, other bodily fluids, or tissue are examined to learn about a person’s health. Such tests are ordered to discover the cause of symptoms, confirm a diagnosis, or screen for disease. Test results can also help rule out alternative diagnoses, assess and monitor the progression of a disease, and guide the clinician’s treatment plan.
Test results show whether a person falls within the normal lab values for a particular analyte or biomarker, but clinicians should be asking two crucial questions: (i) How are the normal values determined? (ii) Is there a better way to define and use them in clinical practice?
DETERMINING NORMAL VALUES
According to the FDA, normal lab test values are generally given as a range because normal values vary from person to person. The medical definition of the normal range is the interval that encompasses 95 percent of values from a healthy population. The remaining five percent of results from a healthy population fall outside the normal range, as do any truly abnormal results.
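The 95 percent definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated laboratory procedure: it assumes the range is the central 95 percent (the 2.5th to 97.5th percentiles) and uses a rank-based estimate with the p * (n + 1) convention, neither of which is specified in this article. The function name and the sample values are hypothetical.

```python
def reference_interval(values, coverage=0.95):
    """Return (lower, upper) limits enclosing the central `coverage` fraction.

    Assumption: the interval is central, so half of the excluded
    results fall below the lower limit and half above the upper limit.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    tail = (1.0 - coverage) / 2.0

    def percentile(p):
        # 1-based rank using the p * (n + 1) convention, clamped to the data.
        rank = min(max(p * (n + 1), 1.0), float(n))
        lower_index = int(rank) - 1
        fraction = rank - int(rank)
        if lower_index + 1 >= n:
            return float(data[-1])
        step = data[lower_index + 1] - data[lower_index]
        return data[lower_index] + fraction * step

    return percentile(tail), percentile(1.0 - tail)


# Hypothetical results from 120 healthy reference individuals (values 1..120).
healthy_results = list(range(1, 121))
low, high = reference_interval(healthy_results)

# Healthy results that still land outside the computed range.
outside = [x for x in healthy_results if x < low or x > high]
print(low, high, len(outside))
```

With these illustrative inputs, six of the 120 healthy results (five percent) fall outside the computed limits, which is exactly the behavior the definition implies.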
The current paradigm for establishing reference intervals is for each laboratory to determine its own reference range for each test it offers. Reference ranges from published sources can be transferred for use in a clinical laboratory if the laboratory verifies those values. A laboratory can also use its current ranges, the manufacturer’s ranges, or locally established ranges as the baseline. The first step in the process is to define the population to be used. A healthy population can be identified using a health assessment or questionnaire, with a minimum of 20 to 25 samples required to transfer reference ranges from an external source to an individual laboratory. By contrast, the Clinical and Laboratory Standards Institute (CLSI) states that 120 reference individuals should be used to establish reference intervals for laboratory analytes.
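The verification step for a transferred range might look like the sketch below. The article gives only the minimum sample count, so the acceptance rule here (accept the range when no more than 10 percent of local healthy results fall outside it) is an assumption, exposed as the `max_outside_fraction` parameter. The function name, the range limits, and the local results are all hypothetical.

```python
def verify_transferred_range(local_results, lower, upper,
                             max_outside_fraction=0.10, min_samples=20):
    """Check a published range against results from local healthy subjects.

    Returns (accepted, outside_values). The 10 percent threshold is an
    assumed acceptance rule, not one taken from this article.
    """
    n = len(local_results)
    if n < min_samples:
        raise ValueError(f"need at least {min_samples} samples, got {n}")
    outside_values = [x for x in local_results if x < lower or x > upper]
    accepted = len(outside_values) / n <= max_outside_fraction
    return accepted, outside_values


# Hypothetical published range and 20 hypothetical local results.
published_lower, published_upper = 3.5, 5.1
local = [3.6, 3.8, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7,
         3.9, 4.0, 4.2, 4.4, 4.8, 4.9, 5.0, 3.7, 5.3, 4.1]

accepted, outside_values = verify_transferred_range(
    local, published_lower, published_upper)
print(accepted, outside_values)
```

A laboratory that fails such a check would typically need to investigate the discrepancy or establish its own interval from a larger cohort, as the CLSI figure of 120 reference individuals suggests.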
In addition to the challenge of recruiting a suitable cohort of normal, healthy subjects, laboratory methods and instrumentation are not standardized. Efforts are currently underway to reduce between-method variability. The production and adoption of reference materials and reference methods, together with the establishment of certified reference laboratories, help to drive accurate patient results. Proficiency testing is also required for laboratory certification under the Clinical Laboratory Improvement Amendments of 1988 (CLIA) and for accreditation by the College of American Pathologists (CAP).
Nonetheless, many test results depend on demographic factors (e.g. race, gender) or even on the time of day when the sample is taken (e.g. diurnal variation in cortisol levels). Ideally, such variation would be factored into normal values.
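One way to factor such variation into normal values is to partition the healthy reference population and compute a separate interval for each subgroup. The sketch below is illustrative only. It assumes a central 95 percent interval, reuses the CLSI figure of 120 reference individuals as a minimum subgroup size, and runs on synthetic numbers that do not represent real cortisol measurements. The function name and the grouping keys are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def partitioned_intervals(records, min_group_size=120):
    """Compute a central 95 percent interval per subgroup.

    `records` is an iterable of (group_key, value) pairs from healthy
    subjects. Subgroups smaller than `min_group_size` map to None.
    """
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)

    intervals = {}
    for key, values in groups.items():
        if len(values) < min_group_size:
            intervals[key] = None  # too few subjects for a reliable interval
            continue
        # n=40 gives cut points every 2.5 percent; the first and last
        # are the 2.5th and 97.5th percentiles.
        cuts = quantiles(values, n=40, method="exclusive")
        intervals[key] = (cuts[0], cuts[-1])
    return intervals


# Synthetic values only, grouped by time of draw.
records = (
    [("morning", 10 + 0.1 * i) for i in range(120)]
    + [("evening", 3 + 0.05 * i) for i in range(120)]
    + [("night", 2 + 0.1 * i) for i in range(10)]
)
intervals = partitioned_intervals(records)
for key, limits in intervals.items():
    print(key, limits)
```

The undersized "night" group returns no interval, which hints at the practical cost of partitioning: every additional subgroup needs its own adequately sized cohort of healthy subjects.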
MINING BIG DATA FOR MORE CLINICALLY RELEVANT NORMAL REFERENCE RANGES
Big data is characterized by high volume, high velocity, and high variety (the 3 Vs). Such data require new forms of processing to enable enhanced decision making. The new reality of cost containment and competition in laboratory medicine has led to increasing use of high-throughput automated equipment that can rapidly generate large volumes of data. For example, a modern hospital or commercial laboratory can generate four to five million test results or more per year. A wide range of data is routinely collected on each individual tested, including pre-analytical demographics (e.g. gender, race, height, and weight) and, in many cases, medical history, when and where the sample was drawn, transit time, processing time, analyzer time, and specimen integrity (hemolysis, icterus, and lipemia indices). Beyond the volume of data generated by laboratory testing, there has also been a considerable increase in the variety of data collected on a patient via images (e.g. X-rays, MRI), charts (e.g. EKG, EEG), and panels (e.g. DNA, multiplex immunoassays).
As clinical data accumulate in databases, the potential arises to use big data analytics to determine whether a patient’s test results are consistent with a particular disease. Is it therefore feasible to use computer-directed analytics clinically to gauge the impact of multiple variables on an individual patient’s diagnosis, or on the likely outcome of a treatment regimen?
With this question in mind, researchers have proposed analyzing big data from large databases and electronic health records (EHRs) to define normal ranges tailored to the individual patient. They pointed out that relatively few systematic studies have examined how reference ranges vary across populations with different demographics. Analysis of large databases may make such systematic study feasible, but at present significant challenges remain.
Most importantly, few large databases are sufficiently free of bias in methodology and instrumentation to support optimal clinical use. Another challenge is that currently available EHR databases do not contain a defined population of healthy, normal subjects. Studies have also shown significant variation in the reference ranges and analyte values that individual laboratories report for standardized samples sent to them under CLIA and CAP programs. Other challenges to big data-driven tailored normal ranges include the multiplicity of sub-populations within demographic groups and the sheer number of analytes that would need to be evaluated.
THE PATH FORWARD
The multiple impediments to developing tailored normal ranges seem insurmountable at this time. The results of clinical testing need to be harmonized, within clinically meaningful limits, to enable optimal disease diagnosis and patient management. Similarly, standardization of EHRs and other databases appears critical to advances in precision medicine. Once large harmonized databases have been developed, big data analysis may play a role in evaluating normal ranges in the not-too-distant future.