An emerging machine learning model may facilitate automated assessment of background parenchymal enhancement (BPE) on breast magnetic resonance imaging (MRI) exams for women with dense breasts.
In a secondary analysis of the Dense Tissue and Early Breast Neoplasm Screening (DENSE) trial, recently published in the European Journal of Radiology, researchers compared BPE assessment by a machine learning model with BPE assessment by experienced radiologists in 4,553 women (mean age of 56) with extremely dense breasts.
The cohort consisted of 3,436 women with minimal BPE (75 percent), 663 women with mild BPE (15 percent), 273 women with moderate BPE (6 percent) and 181 women with marked BPE (4 percent). The researchers noted imputed BPE scoring for 120 women who otherwise did not have BPE information. The machine learning model’s capability for predicting BPE was built upon 15 extracted quantitative MRI features of fibroglandular tissue, according to the study.
The researchers found that the machine learning model had a 95 percent sensitivity rate for minimal BPE in comparison to 16 percent for mild BPE, 13 percent for moderate BPE and 36 percent for marked BPE. However, the machine learning model also demonstrated 95 percent, 96 percent, and 97 percent specificity rates, respectively, for mild, moderate, and marked BPE. The study authors noted a 46 percent specificity rate for minimal BPE.
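For readers less familiar with per-category sensitivity and specificity in a four-category rating, the following Python sketch shows one common way these rates can be derived from a confusion matrix using a one-vs-rest approach. It is an illustration only: the function name `one_vs_rest_rates` and the counts in `toy_confusion` are hypothetical and are not taken from the study, and the study authors may have computed their figures differently.

```python
from typing import Dict, List

# BPE categories in the order used for rows and columns of the matrix.
CATEGORIES = ["minimal", "mild", "moderate", "marked"]


def one_vs_rest_rates(confusion: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Compute per-category sensitivity and specificity (one-vs-rest).

    confusion[i][j] is the number of exams whose reference (radiologist)
    category is i and whose predicted (model) category is j.
    Assumes every category has at least one reference exam and at least
    one exam from another category, so no denominator is zero.
    """
    total = sum(sum(row) for row in confusion)
    rates: Dict[str, Dict[str, float]] = {}
    for k, name in enumerate(CATEGORIES):
        tp = confusion[k][k]                        # correctly assigned to k
        fn = sum(confusion[k]) - tp                 # category k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp  # other category, predicted k
        tn = total - tp - fn - fp                   # everything else
        rates[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return rates


# Hypothetical counts for illustration only; NOT the study's data.
toy_confusion = [
    [90, 6, 3, 1],  # reference: minimal
    [30, 8, 1, 1],  # reference: mild
    [12, 3, 3, 2],  # reference: moderate
    [5, 2, 1, 2],   # reference: marked
]

rates = one_vs_rest_rates(toy_confusion)
for name, r in rates.items():
    print(f"{name}: sensitivity={r['sensitivity']:.2f}, specificity={r['specificity']:.2f}")
```

In this toy example, a model that assigns most exams to the majority category shows high sensitivity but low specificity for that category, and the reverse for the less common categories, which is the same general pattern the researchers reported.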
The researchers also pointed out that automated BPE assessment with the machine learning model had a 2.12 hazard ratio (HR) for breast cancer occurrence in comparison to a 1.97 HR for manual BPE assessment.
“The results indicate that it is feasible to automate the evaluation of BPE category in women with extremely dense breasts, although the accuracy for minimal BPE is superior to that for other BPE categories. The inter-observer variability in BPE has been well documented … However, our results show that the underlying association between BPE and breast cancer occurrence – expressed as hazard ratio – was not affected and was comparable with that between manually rated BPE in the DENSE trial and breast cancer occurrence,” wrote study co-author Kenneth G.A. Gilhuijs, Ph.D., an associate professor at the University Medical Center Utrecht in Utrecht, the Netherlands, and colleagues.
Based on majority voting at eight hospitals participating in the study, the study authors said the accuracy of the machine learning model for predicting BPE ranged from 56 percent to 84 percent with a weighted mean accuracy of 76 percent. The researchers suggested that higher BPE scoring was a common practice by radiologists at the one hospital that reported a 56 percent accuracy rate for the machine learning model.
“Consequently, hospital 6 had the lowest percentage of women in the minimal BPE category (55 %), while the sensitivity of the machine-learning model for this category was the highest. The percentage of women in the minimal BPE category ranged between 67 % and 80 % in the other hospitals,” maintained Gilhuijs and colleagues.
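A weighted mean accuracy differs from a simple average of per-hospital accuracies because larger sites count for more. The short Python sketch below illustrates the arithmetic. It assumes the weights are the number of exams at each site, which the article does not specify, and the site values shown are hypothetical rather than the study's per-hospital results.

```python
from typing import Sequence


def weighted_mean_accuracy(accuracies: Sequence[float], weights: Sequence[float]) -> float:
    """Return the mean of per-site accuracies, weighted by per-site weights.

    With exam counts as weights, this equals (total correct) / (total exams).
    """
    if len(accuracies) != len(weights):
        raise ValueError("accuracies and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(a * w for a, w in zip(accuracies, weights)) / total_weight


# Hypothetical values for three sites; NOT the study's per-hospital data.
site_accuracy = [0.56, 0.80, 0.84]
site_exams = [200, 500, 300]  # assumed weights: number of exams per site

weighted = weighted_mean_accuracy(site_accuracy, site_exams)
unweighted = sum(site_accuracy) / len(site_accuracy)
print(f"weighted mean: {weighted:.3f}, unweighted mean: {unweighted:.3f}")
```

In this illustration the smallest site has the lowest accuracy, so it pulls the weighted mean down less than it pulls the unweighted mean.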
Three Key Takeaways
- Automated assessment potential. The study demonstrates the potential of a machine learning model to automate the assessment of background parenchymal enhancement (BPE) on breast MRI exams, particularly for women with extremely dense breasts. This suggests a promising avenue for reducing reliance on manual assessment, which can be subject to inter- and intra-reader variability.
- Accuracy variability. The accuracy of the machine learning model for predicting BPE varied across the participating hospitals, ranging from 56% to 84% with a weighted mean accuracy of 76%. Differing practices among radiologists may contribute to this variability: the researchers suggested that radiologists at the hospital with the lowest model accuracy tended to assign higher BPE scores. Understanding these variations is crucial for implementing such models effectively across different health-care settings.
- Clinical implications. Despite variations in accuracy, automated BPE assessment using the machine learning model showed a hazard ratio for breast cancer occurrence that was comparable to that of manual assessment (2.12 vs. 1.97). This suggests that although the model is more accurate for minimal BPE than for other BPE categories, the underlying association between BPE and breast cancer occurrence remains consistent. Implementing automated BPE assessment could potentially overcome limitations associated with inter- and intra-reader variability in manual assessment according to the BI-RADS lexicon. However, further studies in varied populations and settings are needed to validate and optimize the model's performance.
Noting radiologist variation with BPE assessment via the BI-RADS classification system, the study authors said automated BPE assessments have promise in predicting breast cancer.
“In clinical practice, BPE is rated by radiologists according to the BIRADS lexicon, and is subject to inter- and intrareader variation, which may have hindered its role as a predictive imaging biomarker of breast cancer. Quantitative evaluation of BPE can overcome this disadvantage,” noted Gilhuijs and colleagues.
(Editor’s note: For related content, see “What a New MRI Study Reveals About Enhancing Parenchyma in Women with Extremely Dense Breasts,” “Five Takeaways from New Breast MRI Literature Review” and “Study: Abbreviated MRI and DBT Offer Comparable Breast Cancer Detection in Dense Breasts.”)
In regard to study limitations, the authors noted that 90 percent of the cohort consisted of women with minimal or mild BPE and that the unbalanced cohort may have affected the machine learning model. While there was external validation with multiple facilities, the researchers emphasized that studies of the model in other countries are necessary.