AI in Cancer Detection: Are We There Yet?

In 2015, 17.5 million cancer cases were reported worldwide—a 33 percent increase from 2005.

For most cancers, early diagnosis is key to improving survival rates. Radiological and histopathological images have long been the cornerstones of cancer detection and staging. Given the growing number of cancer cases, radiologists and pathologists must analyze high volumes of imaging data. By some estimates, an average radiologist may have to interpret one image every three to four seconds to meet the demands of their daily workload. Such pressures may lead to burnout and misdiagnoses that harm both the physician and the patient. In addition to being time- and labor-intensive, the manual analyses currently used in clinical practice suffer from low reproducibility and high inter-rater variability because the measures are subjective. Solutions that improve both the efficiency and the accuracy of cancer diagnosis are therefore urgently needed.

With applications ranging from facial recognition to language processing, artificial intelligence (AI) has the potential to revolutionize health care. The major reason cancer diagnosis is primed for AI is big data: large datasets from which the algorithms can “learn.”

Human versus machine

Machine learning algorithms have been designed to identify and characterize specific types of cancer. Retrospective studies suggest that these algorithms can perform on par with oncology experts, and may even surpass them.

A recent study found that the accuracy of an AI system in independently detecting breast cancer from digital mammograms was comparable to the average of 101 radiologists. In a study evaluating lymph node metastases of breast cancer, the accuracy of AI was significantly better than that of a panel of pathologists analyzing the slides under a time constraint, but was comparable to that of a pathologist working without time constraints. Similarly, Google’s lymph node assistant, or LYNA, showed over 99 percent accuracy on a test dataset of lymph node metastases graded by two board-certified pathologists, indicating performance comparable to that of pathologists. In addition, LYNA significantly increased the ability of pathologists to detect micrometastases while reducing the time spent reviewing each image. These studies suggest that using AI to highlight regions of interest during histopathological review can improve accuracy in a time-limited clinical setting.

Using deep learning, researchers trained an algorithm to distinguish normal lung tissue from two common types of lung cancer—adenocarcinoma and squamous cell carcinoma. The program also correctly classified over 80 percent of images that at least one of three pathologists in the study misclassified, suggesting that AI could offer a useful second opinion and limit the rate of cancer misdiagnosis. Furthermore, the algorithm could reliably predict six commonly observed genetic mutations in lung adenocarcinoma without DNA sequencing, potentially enabling early initiation of targeted therapy.

AI can also diagnose skin, cervical, and brain cancers, with some studies suggesting that algorithms may outperform clinicians. Google’s Inception v4 neural network classified dermoscopic images with significantly higher accuracy, sensitivity, and specificity than the average performance of 58 dermatologists. A machine learning algorithm for cervical cancer screening surpassed colposcopists and conventional cytology in detecting pre-cancer abnormalities. In a feasibility study on MR images from a small cohort of brain tumor patients, a machine learning algorithm outperformed two expert neuroradiologists in distinguishing radiation-induced necrosis from recurrent tumors. More recently, neural networks were able to predict genetic mutations in gliomas from certain key features of MR images, with 83 to 94 percent accuracy and without human supervision.

Real-world applications

With several studies retrospectively demonstrating the feasibility of AI in cancer diagnostics, regulatory authorities are ushering the technology into the real world with cautious optimism.

Early last year, Arterys’ Oncology AI Suite was approved for radiologic detection of liver and lung cancer in the US, Europe, and Canada. The software automates the segmentation of lung nodules and liver lesions. Tumor segmentation is used to characterize tumors and to distinguish malignant ones from benign ones, and it plays a critical role in treatment decisions. Clinicians can edit the automated segmentations in the Oncology AI Suite, which gives them final control over diagnostic and therapeutic decisions.

Deep learning radiology software from the UK-based startup Kheiron Medical Technologies was approved for use in the UK and European health care systems last year. The software will be used as a diagnostic aid for reading mammographic images in breast cancer screening. The company is now seeking FDA approval for the US market.

Recently, the computational pathology startup Paige.AI received breakthrough device designation from the FDA for its AI technology, which aims to use deep learning to diagnose and classify several cancer types from digitized histopathology slides licensed from Memorial Sloan Kettering Cancer Center. Breakthrough device designation is granted to technologies with the potential to provide significant advantages over existing standards for the diagnosis or treatment of serious medical conditions.

Some technologies are being marketed to consumers prior to regulatory approval. Smartphone applications such as SkinVision use machine learning-based image analysis to assist in the early detection of melanoma. Their algorithms use the size, shape, and color of skin lesions to predict melanoma risk. With 80 percent sensitivity and 78 percent specificity in detecting malignant conditions, the SkinVision app was considered less accurate than a trained dermatologist.
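To make the sensitivity and specificity figures above more concrete, the following is a minimal sketch in Python of how they translate into positive and negative predictive values. The sensitivity (80 percent) and specificity (78 percent) are the figures quoted above; the prevalence values and the function name `predictive_values` are illustrative assumptions, not data from any study cited here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    # Expected fractions of the screened population in each confusion-matrix cell
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # share of positive results that are truly malignant
    npv = true_neg / (true_neg + false_neg)  # share of negative results that are truly benign
    return ppv, npv


# Sensitivity and specificity as quoted in the text; prevalence values are hypothetical.
for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(0.80, 0.78, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Under these assumptions, if only 1 percent of the lesions photographed were malignant, roughly 3.5 percent of the lesions flagged by such a test would turn out to be malignant. This is one reason a consumer screening tool is usually judged on more than its headline sensitivity and specificity.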

Limitations and ethical considerations

Despite millions of dollars being pumped into the prospect of integrating AI into the health care system, the technology is not without limitations. First, an algorithm is only as good as the data from which it learns. Machine learning algorithms may develop racial, socioeconomic, or gender biases if they have not been trained on a diverse set of data. For instance, IBM’s AI-based clinical decision support system, Watson for Oncology, was trained on a US dataset but is increasingly used in Asia. As a result, the concordance between its treatment recommendations and those of experts in Korea, China, Thailand, and India ranges from 49 percent to 83 percent. Second, there is the black box problem: it is not always clear how an AI system arrived at a particular output. This opacity undermines the autonomy of physicians, who may be wary of trusting a system they do not fully understand. Third, skepticism may stem from a lack of clinical validation. Although several studies have demonstrated feasibility, the generalizability of AI systems has not been tested on large, heterogeneous, multi-center datasets. Lastly, clinicians grapple with ethical considerations, including the protection of patient privacy and the question of who shoulders the blame in case of AI error.

AI-based cancer diagnostics show great promise in rapidly identifying certain types of cancer with high accuracy and specificity. However, for any new technology to be widely accepted, it needs to be comparable to, if not better than, the existing standard of care. For now, AI can serve only as a diagnostic aid in clinical decision-making. Without randomized clinical trials comparing AI with physicians on compliance with guidelines, improvements in patient outcomes, or cost savings, autonomous AI-powered cancer diagnostics remain a hope for the future.