An emerging machine learning ultrasound model demonstrated significantly higher sensitivity and specificity for predicting thyroid malignancy risk in comparison to the Thyroid Imaging Reporting and Data System (TI-RADS) and American Thyroid Association (ATA) guidelines.
For the retrospective study, recently published in European Radiology, researchers evaluated eight ultrasound-based machine learning algorithms as well as TI-RADS and the ATA guidelines in the assessment of 1,035 thyroid nodules for predicting malignancy risk.
The researchers found that the XGBoost machine learning algorithm provided an 88.3 percent area under the receiver operating characteristic curve (AUC) in contrast to 54.2 percent for TI-RADS and 44.3 percent for the ATA guidelines.
The XGBoost algorithm also offered sensitivity (84.2 percent) that was more than 7 percentage points higher than that of the ATA guidelines (76.3 percent) and more than 20 percentage points higher than that of TI-RADS (63.2 percent), according to the study authors. They also pointed out that specificity with the XGBoost machine learning application was over 40 percentage points higher than with the TI-RADS system (92.3 percent vs. 48.5 percent) and over 60 percentage points higher than with the ATA guidelines (27.2 percent).
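The gaps quoted above are differences in percentage points, and they follow directly from the reported figures. As a quick check, here is a minimal Python sketch; the dictionary and function names are illustrative assumptions and do not come from the study.

```python
# Sensitivity and specificity (in percent) as reported in the article.
# REPORTED and point_gap are hypothetical names used only for illustration.
REPORTED = {
    "XGBoost": {"sensitivity": 84.2, "specificity": 92.3},
    "TI-RADS": {"sensitivity": 63.2, "specificity": 48.5},
    "ATA": {"sensitivity": 76.3, "specificity": 27.2},
}


def point_gap(metric, baseline, model="XGBoost"):
    """Return the percentage-point difference between model and baseline."""
    return round(REPORTED[model][metric] - REPORTED[baseline][metric], 1)


if __name__ == "__main__":
    for baseline in ("TI-RADS", "ATA"):
        for metric in ("sensitivity", "specificity"):
            gap = point_gap(metric, baseline)
            print(f"{metric} vs. {baseline}: +{gap} points")
```

Running the script prints the four gaps, which can be compared against the "more than 7," "more than 20," "over 40" and "over 60" figures cited in the text.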
“Compared to ACR TI-RADS and ATA, which are the traditional risk stratification methods, the ML approaches in this study were superior in distinguishing benign from malignant thyroid lesions,” wrote lead study author Seyed Mahdi Hosseini Sarkhosh, M.D., who is affiliated with the Department of Industrial Engineering at the University of Garmsar in Garmsar, Iran, and colleagues.
The study authors also determined that the XGBoost machine learning algorithm was associated with a significantly lower rate of unnecessary fine-needle aspiration (FNA) (7 percent) than the TI-RADS system (43 percent) and the ATA guidelines (63 percent).
“This is critical given the clinical necessity to minimize rates of missed malignancies without dramatically increasing the number of unnecessary interventions,” emphasized Sarkhosh and colleagues.
Three Key Takeaways
1. Superior diagnostic performance. The XGBoost machine learning algorithm outperformed TI-RADS and ATA guidelines in predicting thyroid malignancy risk, achieving an AUC of 88.3 percent compared to 54.2 percent (TI-RADS) and 44.3 percent (ATA).
2. Improved sensitivity and specificity. The XGBoost model demonstrated significantly higher sensitivity (84.2 percent) and specificity (92.3 percent), reducing false positives and false negatives compared to TI-RADS and ATA guidelines.
3. Reduction in unnecessary FNAs. The machine learning approach led to a substantial reduction in unnecessary fine-needle aspirations (7 percent vs. 43 percent for TI-RADS and 63 percent for ATA), improving clinical efficiency and patient outcomes.
Family history, the presence of pathological lymph nodes and a history of head and neck irradiation were noted as key factors in predicting malignancy risk, according to the study authors. For example, 44 percent of patients with malignant thyroid nodules had a positive family history vs. 7 percent of those with benign nodules.
However, the researchers noted that none of these factors are utilized in the TI-RADS system or ATA guidelines.
“The combination of clinical and demographic features, with ultrasound and FNA cytology significantly enhances the model’s performance and emphasizes their combined value in ML-based risk assessment tools,” added Sarkhosh and colleagues.
In terms of study limitations, the authors noted the retrospective nature of the research and the use of tenfold cross-validation for internal validation of the machine learning model. The researchers emphasized that prospective, multicenter studies are necessary for external validation of the study findings.
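For readers unfamiliar with the technique, tenfold cross-validation can be sketched as follows. This is a minimal illustration in plain Python, assuming a simple shuffled split of the 1,035 nodules; the article does not describe the study's actual splitting procedure (for example, whether folds were stratified by outcome), and the function name and seed are hypothetical.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: a plain shuffled split. The study's real procedure
    is not described in the article and may differ.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_no, (train, test) in enumerate(kfold_indices(1035), start=1):
        print(f"fold {fold_no}: train={len(train)} test={len(test)}")
```

Each nodule is held out exactly once, but every fold is drawn from the same dataset. That is why the authors treat this as internal validation only and call for external validation in independent cohorts.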