Today's Clinical Lab - News, Editorial and Products for the Clinical Laboratory

Machine Learning Model Accurately Predicts Liver Cancer Risk

New algorithms developed by clinicians and data scientists could help personalize treatment

University of California - Davis Health System
Published: Feb 29, 2024

A team of University of California (UC) Davis Health clinicians and data scientists has developed a machine learning model to better predict patients’ risk of developing hepatocellular carcinoma (HCC).

The findings of their research, published in Gastro Hep Advances, describe how predictive learning can help clinicians assess HCC risk early in patients with metabolic dysfunction-associated steatotic liver disease, or MASLD. The pilot technology may give clinicians critical information to screen patients more closely and, thus, offer personalized care.

“MASLD can lead to HCC, but the disease is quite sneaky, and it’s often unclear which patients face that risk,” said study co-author Aniket Alurwar, MS, clinical informatics specialist at the UC Davis Center for Precision Medicine and Data Sciences. “It doesn’t make sense to biopsy every patient with MASLD, but if we can segment for risk, we can track those people more closely and perhaps catch HCC early.”

Diagnosing a stealthy condition

MASLD (formerly called nonalcoholic fatty liver disease or NAFLD), a condition often linked to metabolic diseases such as type 2 diabetes, is the accumulation of fat in the liver. Around 25 percent of Americans have some form of MASLD, making it one of the most common liver issues.

The data science team worked closely with clinicians, including first author Souvik Sarkar, MD, PhD, assistant professor in gastroenterology and hepatology, and Frederick Meyers, MD, MACP, senior author and distinguished professor of internal medicine, hematology, and oncology. Meyers is also director of the Center for Precision Medicine and Data Sciences.

The study is one of the first of its kind. Researchers trained machine learning algorithms, which learn from large datasets to make verifiable predictions. They tested nine open-source algorithms and shortlisted five for further evaluation and model building. They then trained the shortlisted algorithms on de-identified health data from 1,561 UC Davis Health patients with MASLD, 227 of whom eventually developed HCC.

Later, these five algorithms were validated against de-identified medical records from 686 UC San Francisco patients, 176 of whom were diagnosed with HCC. An algorithm called Gradient Boosted Trees ultimately produced the prediction model with the greatest statistical accuracy, sensitivity, and specificity.
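
For readers less familiar with the three measures named above, the sketch below shows how accuracy, sensitivity, and specificity are conventionally computed from a model's correct and incorrect predictions. The function name and the patient counts are hypothetical and chosen only for illustration; they are not the study's code or results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: patients who developed HCC and were flagged by the model
    fn: patients who developed HCC but were missed by the model
    tn: patients who did not develop HCC and were not flagged
    fp: patients who did not develop HCC but were flagged
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one patient is required")
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total,
        # Share of true HCC cases the model caught.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of non-HCC patients the model correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up counts for 100 imaginary patients (NOT data from the study).
metrics = classification_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all three measures matters when, as in these cohorts, most patients never develop the disease: a model could score high accuracy simply by predicting that nobody will develop HCC, but its sensitivity would then be zero.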

The study confirmed that one of the most reliable markers for HCC risk is advanced liver fibrosis or scarring, characterized by high fibrosis-4 (FIB-4) index scores. However, the researchers also found four additional risk factors associated with liver function: high cholesterol, hypertension, bilirubin, and alkaline phosphatase (ALP). A combination of those risk factors in one model helped predict HCC risk.
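
The FIB-4 index mentioned above is a simple calculation based on a patient's age and three routine blood test values. The article does not give the formula; the sketch below uses the standard published definition, and the function name and example values are hypothetical.

```python
import math


def fib4_index(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l):
    """Standard FIB-4 index: (age x AST) / (platelets x sqrt(ALT)).

    age_years: patient age in years
    ast_u_per_l: aspartate aminotransferase in U/L
    alt_u_per_l: alanine aminotransferase in U/L
    platelets_10e9_per_l: platelet count in 10^9 per litre
    """
    if min(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_10e9_per_l * math.sqrt(alt_u_per_l)
    )


# Illustrative values only, not taken from any patient in the study.
score = fib4_index(age_years=60, ast_u_per_l=40, alt_u_per_l=25,
                   platelets_10e9_per_l=200)
print(f"FIB-4: {score:.2f}")
```

Because the score rises with age and AST and falls with platelet count, higher values point toward more advanced scarring.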

AI shows high accuracy

The team found there are multiple pathways to HCC, with high FIB-4 being the most obvious. In some cases, patients with low FIB-4 but high cholesterol, bilirubin, and hypertension also developed HCC. Under current guidelines, these patients would not receive precautionary care.

“We got 92.12 percent accuracy when predicting which MASLD patients would develop HCC, which is very good for a pilot model,” Alurwar said. “Patients with low FIB-4 are typically considered low risk and do not get referred for further assessment. By showing which of these ‘low risk’ patients could develop HCC, we can get them referred for liver biopsies or imaging.”

The researchers plan to improve the model's accuracy by incorporating more detailed data, such as clinical notes, using another form of AI called natural language processing, which converts written text into data. “We believe we can improve the algorithm by incorporating the clinical notes and perhaps other information,” said Alurwar. “Embedding this data should create an even more powerful model that we can then test to see how it performs.”

- This press release was originally published on the UC Davis Health website