
Study Assesses Lung CT-Based AI Models for Predicting Interstitial Lung Abnormality


A machine-learning-based model demonstrated an 87 percent area under the curve and a 90 percent specificity rate for predicting interstitial lung abnormality on CT scans, according to new research.

Emerging research suggests machine learning models may facilitate earlier detection of interstitial lung abnormalities (ILAs) on computed tomography (CT) scans.

For the retrospective study, recently published in Radiology, researchers assessed the capability of 12 machine learning models on 1,382 lung CT scans (mean patient age of 67) for predicting the development of ILAs. The models included section inference models that utilized two-label and three-label methods for determining indeterminate sections, and case inference models developed with the machine learning classifiers support vector machine (SVM), random forest (RF) and convolutional neural network (CNN), according to the study.

The researchers found that the machine learning model that combined a three-label method for section inference with a two-label method and RF classifier for the case inference part of the model demonstrated an 87 percent AUC and a 90 percent specificity rate for predicting ILAs.
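For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen scan with ILA receives a higher model score than a randomly chosen scan without it. The Python sketch below illustrates that pairwise definition. The labels and scores are invented toy values, not data from the study, and the function name is hypothetical.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (assumed values): 1 = ILA present, 0 = no ILA.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.05]
print(round(pairwise_auc(toy_labels, toy_scores), 3))
```

An AUC of 0.87 under this reading would mean the model ranks the ILA case higher in roughly 87 of every 100 such pairs.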


Here one can see CT scans with noted interstitial lung abnormalities. The images reveal a subpleural ground-glass abnormality bilaterally (A and B) as well as traction bronchiectasis in the left lung (A). Other findings include a slight subpleural bilateral ground-glass deformity (C) and a more widespread ground-glass deformity (D). (Images courtesy of Radiology.)

However, noting this model’s suboptimal sensitivity (64 percent) and interreader agreement (46 percent), the study authors emphasized the importance of establishing suitable thresholds prior to integrating models into radiology workflows.

“Appropriate threshold settings need to be identified for clinical application. It is important to note that this result was based on visual assessment as ground truth, a common but imperfect method for ILA evaluation. … Determination of the best model or threshold should be based on more reliable clinical outcomes, such as survival,” wrote lead study author Akinori Hata, M.D., who is affiliated with the Center for Pulmonary Functional Imaging in the Department of Radiology at Brigham and Women’s Hospital and Harvard Medical School in Boston, and colleagues.
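The trade-off behind the authors' point about thresholds can be shown with a small Python sketch. It assumes each case receives a single probability score and is called positive when that score meets or exceeds a cutoff. The scores, labels and function name are made up for illustration and do not come from the study.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumed values): 1 = ILA present, 0 = no ILA.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.05]

for threshold in (0.25, 0.5):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy set, raising the cutoff from 0.25 to 0.5 removes the one false positive but misses one true ILA case, which is the same kind of pattern as a model that pairs high specificity with lower sensitivity.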

Three Key Takeaways

1. Machine learning efficacy. The study showed that a machine learning model combining a three-label method for section inference and a two-label method with a random forest (RF) classifier achieved an 87 percent area under the curve (AUC) and 90 percent specificity for predicting ILAs. However, the sensitivity was lower (64 percent).

2. Model comparisons. There was no significant difference in AUC performance between the support vector machine (SVM), RF, and convolutional neural network (CNN) classifiers, with researchers noting that limited data set sizes may have impacted the performance of CNN models.

3. Threshold and clinical outcome relevance. Researchers highlighted the need for appropriate threshold settings and emphasized that models should ideally be evaluated based on reliable clinical outcomes like survival rather than purely visual assessments.

The study authors also noted no significant differences in AUCs between SVM, RF and CNN classifiers with respect to models with two-label section inference and three-label case inference (ranging from 80 to 83 percent), those with three-label section inference and two-label case inference (86 to 87 percent), and models with three-label section inference and three-label case inference (ranging from 85 to 86 percent).

“While RF and SVM classifiers use the probability of each section directly as a feature, a CNN classifier generates new features through convolution and other operations. This process requires a large number of cases to generate effective features, so the limited sample size in this study might have resulted in the lower performance of the CNN models compared with the RF and SVM models,” noted Hata and colleagues.
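To make the idea of using each section's probability "directly as a feature" concrete, the Python sketch below turns a variable-length list of per-section probabilities into a fixed-length feature vector that a classifier such as RF or SVM could accept. The article does not describe how the study handled scans with different numbers of sections, so the linear resampling step, the vector length and the function name are all assumptions made purely for illustration.

```python
def resample_probabilities(section_probs, n_features=8):
    """Linearly resample per-section probabilities to a fixed-length vector.

    Assumption: this resampling is an illustrative choice, not the study's method.
    """
    if not section_probs:
        raise ValueError("at least one section probability is required")
    if n_features < 2:
        raise ValueError("n_features must be at least 2")
    if len(section_probs) == 1:
        return [section_probs[0]] * n_features

    last = len(section_probs) - 1
    features = []
    for i in range(n_features):
        pos = i * last / (n_features - 1)  # fractional index into the sections
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        features.append((1 - frac) * section_probs[lo] + frac * section_probs[hi])
    return features


# Hypothetical per-section ILA probabilities for one scan, ordered apex to base.
example_case = [0.05, 0.10, 0.40, 0.85, 0.90]
print(resample_probabilities(example_case, n_features=3))
```

A CNN-based case classifier, by contrast, would learn its own features from such sequences through convolution, which is the step the authors suggest needs more cases to work well.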

(Editor’s note: For related content, see “FDA Clears CT-Based AI Software for Assessing Interstitial Lung Disease,” “FDA Clears AI-Powered CT Assessment Tool for Lung Fibrosis” and “Can Deep Learning Bolster CT Detection and Classification of Usual Interstitial Pneumonia?”)

In regard to study limitations, the authors conceded the lack of external validation and clinical outcome evaluation. They also noted the study’s inclusion of section images 2 < 2.5 mm, and contrast-enhanced CT for 75 percent of the reviewed exams, two factors that can affect the detection of ILA, according to the researchers.

© 2025 MJH Life Sciences

All rights reserved.