Free text radiology reports vary considerably in length and are difficult to interpret as a result.
Free text reporting produces extensive variability in report length and terminology and is difficult for machines to interpret, according to a study published in the Journal of the American College of Radiology.
Researchers from the Milton S. Hershey Medical Center in Hershey, Pa., sought to quantify the variability of language in free text reports of pulmonary embolus (PE) studies and to gauge how informative free text is for predicting a PE diagnosis, using machine learning as a proxy for human understanding.
The researchers evaluated 1,133 consecutive contrast-enhanced chest CT studies performed under a PE protocol. Commercial text-mining and predictive analytics software was used to parse and characterize all report text and to generate a suite of machine learning rules intended to predict the “gold standard” radiological diagnosis of PE.
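The study used commercial software, so its exact pipeline is not public. As a minimal sketch of the general approach the article describes, an analogous open-source pipeline might turn each report into bag-of-words features and fit a shallow decision tree, whose branches act as human-readable if/then rules over report terms. The example reports and labels below are illustrative placeholders, not the study's data.

```python
# Hypothetical illustration: bag-of-words features plus a shallow decision
# tree standing in for the commercial software's "machine learning rules".
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline

# Placeholder report texts with a "gold standard" PE label (1 = PE present).
reports = [
    "No evidence of pulmonary embolus. Lungs are clear.",
    "Filling defect in the right lower lobe artery consistent with PE.",
    "No acute cardiopulmonary abnormality.",
    "Acute pulmonary embolism involving segmental branches.",
]
pe_positive = [0, 1, 0, 1]

# The tree's splits on word counts read as simple term-based rules.
model = make_pipeline(
    CountVectorizer(lowercase=True, stop_words="english"),
    DecisionTreeClassifier(max_depth=3, random_state=0),
)
model.fit(reports, pe_positive)
print(model.predict(["Segmental pulmonary embolus is present."]))
```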
The results demonstrated extensive variation in the length of the findings and impression sections across the reports, which was only marginally associated with a positive PE diagnosis. Term usage was markedly concentrated: just 20 words appeared in the findings sections of 93 percent of the reports, while 896 of 2,296 distinct words were each used in only one report’s impression section. In the validation set, the machine learning rules had perfect sensitivity but imperfect specificity, a low positive predictive value of 73 percent, and a misclassification rate of 3 percent.
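To make those metrics concrete, the snippet below works them out from a confusion matrix. The counts are hypothetical, chosen only so the derived values echo the reported pattern (perfect sensitivity, roughly 73 percent positive predictive value, a 3 percent misclassification rate); the study's actual validation counts are not given in this summary.

```python
# Hypothetical confusion-matrix counts, not the study's data.
tp, fn, fp, tn = 8, 0, 3, 89  # true/false positives and negatives (n = 100)

sensitivity = tp / (tp + fn)                          # 8/8  = 1.00 (perfect)
specificity = tn / (tn + fp)                          # 89/92 ~ 0.97 (imperfect)
ppv = tp / (tp + fp)                                  # 8/11 ~ 0.73
misclassification = (fp + fn) / (tp + fn + fp + tn)   # 3/100 = 0.03

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"ppv={ppv:.2f} misclassification={misclassification:.2f}")
```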
The researchers concluded that free text reporting was difficult for machines to interpret because of the extensive variability in report length and terminology. They suggested this also presents potential difficulties for human readers in fully understanding such reports. “These results support the prospective assessment of the impact of a fully structured report template with at least some mandatory discrete fields on ease of use of reports and their understanding,” they wrote.