Emerging research suggests that artificial intelligence (AI) may enhance breast cancer detection in patients undergoing sequential screening with digital breast tomosynthesis (DBT).
For the retrospective study, recently published in Radiology: Artificial Intelligence, researchers assessed the use of DBT-based AI software (ProFound AI v2.0, iCAD) in 1,799 women (mean age, 58.1 years) who had one-year follow-up data after having two or more DBT-AI screenings. The study authors noted that DBT-AI case scores of > 70 and case score changes of > 25 were considered “noteworthy.”
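To make the “noteworthy” definition concrete, the following is a minimal Python sketch of how an exam pair might be flagged. The cutoffs (> 70 and > 25) come from the study as reported here; the `ScreeningPair` class, the `is_noteworthy` function, and the choice to flag an exam when either condition is met are illustrative assumptions and are not taken from the study or from the ProFound AI software.

```python
from dataclasses import dataclass

# Cutoffs as reported in the article.
NOTEWORTHY_SCORE = 70    # case score > 70
NOTEWORTHY_CHANGE = 25   # case score change > 25


@dataclass
class ScreeningPair:
    """Hypothetical container for two sequential DBT-AI case scores."""
    prior_score: float
    current_score: float

    @property
    def change(self) -> float:
        # Change is defined here as current minus prior (an assumption).
        return self.current_score - self.prior_score


def is_noteworthy(pair: ScreeningPair) -> bool:
    """Flag a pair if the current score OR the change exceeds its cutoff.

    Combining the two conditions with OR is an assumption made for
    illustration only.
    """
    return (pair.current_score > NOTEWORTHY_SCORE
            or pair.change > NOTEWORTHY_CHANGE)


if __name__ == "__main__":
    # Made-up scores, not study data.
    for pair in [ScreeningPair(40, 75), ScreeningPair(30, 35), ScreeningPair(20, 50)]:
        print(pair, "noteworthy:", is_noteworthy(pair))
```

In this sketch, a flag would only mark an exam for closer scrutiny; as the study authors emphasize, all exams still require radiologist interpretation.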
The researchers found that true positive examinations had the highest average DBT-AI case score at 75 in contrast to 42 for false positive exams, 37 for false negative exams and 34 for true negative exams.
True positive exams also had the largest mean DBT-AI score change (21.1) with sequential screenings in comparison to 9.23 for false positive exams, 0.73 for true negative exams and -0.17 for false negative exams, according to the study authors.
“While all studies currently require radiologist interpretation regardless of change in score, our preliminary review suggests that high case score and/or case score changes should be further scrutinized to maximize screening performance. Our results show that utilizing sequential case score changes along with radiologists’ interpretation may improve patient outcomes,” wrote lead study author Samantha P. Zuckerman, M.D., who is affiliated with the Department of Radiology at the Hospital of the University of Pennsylvania in Philadelphia, Pa., and colleagues.
The researchers employed DBT-AI case score and case score change thresholds of 26 and +1, respectively, for cancer detection. For the DBT-AI case score cutoff, the researchers noted an 89.4 percent sensitivity rate and a 42.9 percent specificity.
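For readers less familiar with these metrics, here is a minimal Python sketch of how sensitivity and specificity follow from applying a case score cutoff. The cutoff of 26 is the value reported above; the function name, the assumption that a score at or above the cutoff counts as positive, and the toy scores are hypothetical and are not study data.

```python
def sensitivity_specificity(scores, has_cancer, cutoff=26):
    """Return (sensitivity, specificity) for a score cutoff.

    An exam is called positive when score >= cutoff (an assumption;
    the article does not say whether the cutoff is inclusive).
    """
    tp = fn = tn = fp = 0
    for score, cancer in zip(scores, has_cancer):
        positive = score >= cutoff
        if cancer and positive:
            tp += 1          # cancer present, flagged
        elif cancer:
            fn += 1          # cancer present, missed
        elif positive:
            fp += 1          # no cancer, flagged
        else:
            tn += 1          # no cancer, not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer and one non-cancer exam")
    sensitivity = tp / (tp + fn)   # share of cancers flagged
    specificity = tn / (tn + fp)   # share of non-cancers not flagged
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up example: two cancers, four non-cancers.
    toy_scores = [80, 30, 10, 50, 20, 5]
    toy_cancer = [True, True, False, False, False, False]
    print(sensitivity_specificity(toy_scores, toy_cancer))
```

A low cutoff such as 26 flags most cancers (high sensitivity) at the cost of flagging many non-cancers (low specificity), which is consistent with the 89.4 percent and 42.9 percent figures reported for the case score cutoff alone.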
“In 16 of the 18 TP cancer cases where there was a noteworthy case score change between the current and prior screening examination (≥ 25), the second screen case score was over 70, indicating that the identified abnormality had a relatively higher likelihood of representing malignancy according to the DBT-AI algorithm,” pointed out Zuckerman and colleagues.
Key Takeaways
1. AI-enhanced screening performance. Utilizing artificial intelligence (ProFound AI v2.0) in sequential digital breast tomosynthesis (DBT) screenings may improve breast cancer detection, particularly when focusing on high case scores (> 70) and noteworthy case score changes (> 25).
2. Enhanced sensitivity and specificity. Combining DBT-AI case scores and case score changes improved sensitivity (93.6 percent) and specificity (62.8 percent), which may help radiologists make more informed recall decisions during breast cancer screening.
3. Score changes as indicators. In 16 of 18 true positive cancer cases with a noteworthy case score change between screenings (≥ 25), the second-screen case score was over 70, indicating a relatively higher likelihood of malignancy and suggesting a valuable role for AI in identifying abnormalities over time.
However, when the study authors combined the thresholds for case scores and case score changes, they noted sensitivity and specificity rates of 93.6 percent and 62.8 percent, respectively.
“Using the combination of DBT-AI case score with change in case score over time may help radiologists make recall decisions in DBT screening,” posited Zuckerman and colleagues.
(Editor’s note: For related content, see “Mammography Study Shows Merits of AI for Improving Breast Cancer Detection and Effectiveness of Recalls,” “Mammography Study Suggests DBT-Based AI May Help Reduce Disparities with Breast Cancer Screening” and “Comparing Digital Breast Tomosynthesis to Digital Mammography: What a Long-Term Study Reveals.”)
Beyond the inherent limitations of a single-center retrospective study, the authors acknowledged that results from their assessment of single-vendor DBT and AI modalities may not be broadly applicable to other DBT platforms and AI software. They also noted a lack of one-year follow-up data and conceded that the standards for case score thresholds were based on clinical experience.