Lung cancer, the most common cancer worldwide, is targeted with radiation therapy (RT) in nearly one-half of cases. RT planning is a manual, resource-intensive process that can take days to weeks to complete, and even highly trained physicians vary in their determinations of how much tissue to target with radiation. Furthermore, a shortage of radiation-oncology practitioners and clinics worldwide is expected to grow as cancer rates increase.
Brigham and Women’s Hospital researchers and collaborators, working under the Artificial Intelligence in Medicine Program of Mass General Brigham, developed and validated a deep learning algorithm that can identify and outline (“segment”) a non-small cell lung cancer (NSCLC) tumor on a computed tomography (CT) scan within seconds. Their research, published in Lancet Digital Health, also demonstrates that radiation oncologists using the algorithm in simulated clinics performed as well as physicians not using the algorithm, while working 65 percent more quickly.
“The biggest translation gap in AI applications to medicine is the failure to study how to use AI to improve human clinicians, and vice versa,” said corresponding author Raymond Mak, MD, of the Brigham’s Department of Radiation Oncology. “We’re studying how to make human-AI partnerships and collaborations that result in better outcomes for patients.
“The benefits of this approach for patients include greater consistency in segmenting tumors and accelerated times to treatment. The clinician benefits include a reduction in mundane but difficult computer work, which can reduce burnout and increase the time they can spend with patients.”
The researchers used CT images from 787 patients to train their model to distinguish tumors from other tissues. They then tested the algorithm’s performance on scans from more than 1,300 patients drawn from external datasets of increasing independence from the training data.
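The release does not include the team’s code, but as a purely illustrative sketch of how a segmentation model of this kind is typically trained, the snippet below fits a small 3D U-Net to a CT sub-volume with a Dice loss, using PyTorch and the open-source MONAI library. The architecture, tensor shapes, and hyperparameters here are assumptions for illustration, not the published model.

```python
# Illustrative sketch only -- not the published model. Assumes PyTorch and
# MONAI are installed; architecture, shapes, and hyperparameters are made up.
import torch
from monai.losses import DiceLoss
from monai.networks.nets import UNet

# A small 3D U-Net: 1 input channel (CT intensity), 1 output channel (tumor mask).
model = UNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=1,
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
    num_res_units=2,
)
loss_fn = DiceLoss(sigmoid=True)  # overlap-based loss, standard for segmentation
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Placeholder batch: one 64^3-voxel CT sub-volume and a binary tumor mask.
ct = torch.randn(1, 1, 64, 64, 64)
mask = (torch.rand(1, 1, 64, 64, 64) > 0.5).float()

model.train()
optimizer.zero_grad()
pred = model(ct)            # raw logits; DiceLoss applies the sigmoid
loss = loss_fn(pred, mask)
loss.backward()
optimizer.step()
print(f"Dice loss on toy batch: {loss.item():.3f}")
```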
Developing and validating the algorithm involved close collaboration between data scientists and radiation oncologists. For example, when the researchers observed that the algorithm was incorrectly segmenting tumors on CT scans where the cancer involved the lymph nodes, they retrained the model with more of these scans to improve its performance.
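One common way to implement that kind of error-driven retraining is to oversample the flagged failure cases when building training batches. The sketch below uses PyTorch’s WeightedRandomSampler on synthetic data; the flags and the 4x weight are invented for illustration and do not reflect the study’s actual procedure.

```python
# Illustrative sketch of retraining with oversampled failure cases; the data,
# flags, and 4x weight are synthetic stand-ins, not the study's procedure.
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy dataset: 100 scans, the first 20 flagged as lymph-node-involved cases
# (in practice these flags would come from the clinicians' error review).
scans = torch.randn(100, 1, 32, 32, 32)
masks = (torch.rand(100, 1, 32, 32, 32) > 0.5).float()
is_lymph_node_case = torch.zeros(100, dtype=torch.bool)
is_lymph_node_case[:20] = True

# Give flagged cases 4x the sampling weight so each epoch sees more of them.
weights = torch.ones(100)
weights[is_lymph_node_case] = 4.0
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
loader = DataLoader(TensorDataset(scans, masks), batch_size=4, sampler=sampler)

# Fine-tuning would then iterate over `loader`, typically at a reduced learning rate.
ct_batch, mask_batch = next(iter(loader))
print(ct_batch.shape, mask_batch.shape)
```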
Finally, the researchers asked eight radiation oncologists to perform segmentation tasks as well as rate and edit segmentations produced by either another expert physician or the algorithm (they were not told which). There was no significant difference in performance between human-AI collaborations and human-produced (de novo) segmentations.
Intriguingly, physicians worked 65 percent faster and with 32 percent less variation when editing an AI-produced segmentation compared to a manually produced one, even though they were unaware of which one they were editing. They also rated the quality of AI-drawn segmentations more highly than the human expert-drawn segmentations in this blinded study.
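The release does not specify which agreement metric the study used to measure that variation; the Dice similarity coefficient is the standard way to quantify overlap between two segmentations of the same scan, and a minimal version looks like this (the masks below are synthetic).

```python
# Minimal Dice similarity coefficient between two binary masks (1.0 = identical).
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy example: a synthetic 3D mask and a lightly edited copy of it.
rng = np.random.default_rng(0)
original = rng.random((64, 64, 64)) > 0.7
edited = original.copy()
edited[:2] = ~edited[:2]  # flip two slices to mimic a small manual edit
print(f"Dice = {dice_coefficient(original, edited):.3f}")
```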
Going forward, the researchers plan to combine this work with AI models they designed previously that can identify “organs at risk” of receiving undesired radiation during cancer treatment (such as the heart) and thereby exclude them from radiotherapy. They are continuing to study how physicians interact with AI to ensure that human-AI partnerships help, rather than harm, clinical practice, and are developing a second, independent segmentation algorithm that can verify both human- and AI-drawn segmentations.
“This study presents a novel evaluation strategy for AI models that emphasizes the importance of human-AI collaboration,” said co-author Hugo Aerts, PhD, of the Department of Radiation Oncology. “This is especially necessary because in silico (computer-modeled) evaluations can give different results than clinical evaluations. Our approach can help pave the way towards clinical deployment.”