Findings from an international multicenter study suggest that artificial intelligence (AI) models for assessing ultrasound images for potential cases of ovarian cancer may facilitate a nearly 40 percent reduction in the false negative rate (FNR) in contrast to unassisted expert ultrasound interpreters and over a 65 percent reduction in comparison to non-expert clinicians.
For the retrospective study, recently published in Nature Medicine, researchers utilized 17,119 ultrasound images drawn from 3,652 patients and 20 facilities in eight countries to develop and validate transformer-based neural network models for detecting ovarian cancer. While 992 cases were employed for supplementary training, the study authors compared the AI models to seven expert ultrasound interpreters and six non-expert clinicians for 2,660 cases.
The researchers found that the use of adjunctive AI models led to enhanced sensitivity (85.99 percent vs. 82.4 percent), specificity (86.29 percent vs. 82.67 percent) and accuracy (86.2 percent vs. 82.63 percent) in contrast to expert reviewers of ultrasound imaging for differentiating between benign and malignant ovarian lesions.
In comparison to unassisted non-expert clinicians, the AI models demonstrated increases of more than seven percentage points in sensitivity (85.81 percent vs. 78.71 percent), specificity (85.08 percent vs. 77.27 percent), and accuracy (85.38 percent vs. 77.67 percent) for ultrasound detection of ovarian cancer.
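To make the size of these gains concrete, the short Python sketch below works through the arithmetic using only the figures quoted above. The function and variable names are illustrative and do not come from the study. It separates the absolute gain in percentage points from the relative gain, which is measured against the non-expert baseline.

```python
# Figures as quoted in the article (percent): (AI models, unassisted non-experts).
reported = {
    "sensitivity": (85.81, 78.71),
    "specificity": (85.08, 77.27),
    "accuracy": (85.38, 77.67),
}


def point_gain(ai: float, baseline: float) -> float:
    """Absolute difference in percentage points."""
    return round(ai - baseline, 2)


def relative_gain(ai: float, baseline: float) -> float:
    """Difference expressed as a percent of the baseline value."""
    return round(100.0 * (ai - baseline) / baseline, 2)


# Map each metric to (percentage-point gain, relative gain in percent).
gains = {
    name: (point_gain(ai, base), relative_gain(ai, base))
    for name, (ai, base) in reported.items()
}

for name, (points, relative) in gains.items():
    print(f"{name}: +{points} percentage points, +{relative} percent relative")
```

Under either reading, each metric improves by more than seven, but the two measures are not interchangeable, so it helps to state which one is meant.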
“Our study demonstrates the potential of AI models in improving the accuracy and efficiency of ovarian cancer diagnosis. Our models demonstrated robust generalization and significantly outperformed both expert and non-expert examiners on all evaluated metrics,” wrote lead study author Filip Christiansen, Ph.D., who is affiliated with the Department of Clinical Science and Education and the Department of Obstetrics and Gynecology at the Karolinska Institute in Stockholm, Sweden, and colleagues.
Additionally, when researchers assessed the AI models with a specificity threshold of 82.67 percent, they found a 39.27 percent FNR reduction in comparison to unassisted expert ultrasound reviewers and a 65.37 percent FNR reduction in contrast to unassisted non-expert clinicians.
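For readers less familiar with the metric, the false negative rate is the complement of sensitivity, and the reductions cited here are relative to the human readers' FNR. The Python sketch below illustrates that relationship. It assumes the experts' baseline FNR follows from the 82.4 percent sensitivity quoted earlier, and the resulting AI FNR is a back-of-the-envelope illustration, not a figure reported in the article. All function names are hypothetical.

```python
def false_negative_rate(sensitivity_pct: float) -> float:
    """FNR in percent, as the complement of sensitivity."""
    return 100.0 - sensitivity_pct


def relative_reduction(baseline_fnr: float, new_fnr: float) -> float:
    """Relative FNR reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_fnr - new_fnr) / baseline_fnr


def fnr_after_reduction(baseline_fnr: float, reduction_pct: float) -> float:
    """FNR that remains after applying a relative reduction."""
    return baseline_fnr * (1.0 - reduction_pct / 100.0)


# Assumption: the expert baseline is the 82.4 percent sensitivity quoted above.
expert_fnr = false_negative_rate(82.4)

# Illustrative only: the AI FNR implied by a 39.27 percent relative reduction.
implied_ai_fnr = fnr_after_reduction(expert_fnr, 39.27)

print(f"expert FNR: {expert_fnr:.2f} percent")
print(f"implied AI FNR at matched specificity: {implied_ai_fnr:.2f} percent")
```

Read this way, a relative reduction of nearly 40 percent means roughly four in ten cancers missed by the human readers would be caught, not that sensitivity itself rises by 40 percentage points.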
Three Key Takeaways
1. Significant reduction in false negatives. The AI models achieved a nearly 40 percent reduction in the false negative rate compared to expert reviewers and over a 65 percent reduction compared to non-expert clinicians, enhancing ovarian cancer detection reliability.
2. Improved diagnostic accuracy. AI-assisted interpretation outperformed human experts and non-experts in sensitivity, specificity, and overall accuracy, demonstrating its potential to augment diagnostic precision in ovarian cancer assessments.
3. Potential for workflow enhancement and health equity. The AI models maintained high performance even in challenging cases and reduced non-expert referrals to specialists by 63 percent, emphasizing their role in improving diagnostic equity and efficiency, particularly in resource-limited settings.
Further evaluation of the AI models as second readers in a triage simulation found that adjunctive AI offered improved detection, with an F1 score of 82.70 percent vs. 77.16 percent for unassisted non-expert clinicians. Second reads from the AI models also facilitated a 63 percent reduction in non-expert clinician referrals to expert ultrasound readers, according to the study authors.
“This finding is especially vital given the scarcity of expert examiners, underlining AI’s potential for advancing equitable access to high-quality diagnostic services. In contrast to human examiners, the AI models maintained high performance even in cases where human examiners were uncertain. This suggests that AI-driven diagnostic support may have a particularly important role in cases that are difficult to classify by human examiners,” maintained Christiansen and colleagues.
Beyond the inherent limitations of a retrospective study, the authors noted that reviewing clinicians based their case assessments solely on ultrasound interpretation and that they “most likely” had ultrasound evaluation experience beyond that of other examiners. They also cautioned against broad extrapolation of the study results, noting that the cohort was limited to patients with a postoperative histological diagnosis.