
New Literature Review Finds ChatGPT Effective in Radiology in 84 Percent of Studies


While noting a variety of pitfalls with the chatbot ranging from hallucinations to improper citations, the review authors found the use of ChatGPT in radiology demonstrated “high performance” in 37 out of 44 studies.

One study noted that ChatGPT had an 88.9 percent accuracy rate in determining appropriate imaging for breast pain. Five studies found an 83.6 percent median agreement between ChatGPT and reference standards such as guidelines or radiologist decisions. The use of GPT-4 reportedly offers enhanced capabilities for responding to higher-order thinking questions in radiology.

These are some of the findings from a systematic review, recently published in Diagnostic and Interventional Imaging, of 44 studies evaluating the use of ChatGPT in radiology. Researchers examined findings from studies looking at the use of the chatbot for adjunctive support in decision making, structuring radiology reports, generating radiology reports, improving patient communication, performance on radiology board exams, and as a standalone tool, according to the study.

Overall, the review authors pointed out that 37 out of the 44 reviewed studies noted “high performance” of ChatGPT in radiology applications and the remaining seven studies found “lower performance.”


Seventy percent of studies (14/20) found that adjunctive use of ChatGPT provided significant improvement in radiologist decision making. The researchers also noted that 100 percent of studies examining the use of ChatGPT with radiology reports cited significant benefits in structuring and simplifying the reports (8/8) as well as generating radiology reports (4/4). Five out of six studies suggested that ChatGPT facilitated enhanced patient communication.

“The findings suggest that ChatGPT shows promise in 84.1% of the studies and has the potential to significantly contribute to five broad clinical areas of radiology, including providing diagnostic and clinical decision support, transforming, simplifying and generating radiology reports, patient communication and outcomes, and performance on radiology board examinations,” wrote lead review author Pedram Keshavarz, M.D., a postdoctoral research fellow affiliated with the Department of Radiological Sciences at the David Geffen School of Medicine at the University of California, Los Angeles, and colleagues.

The researchers noted that 11 of the 44 reviewed studies compared ChatGPT 3.5 with GPT-4, and over 90 percent of these studies reported enhanced capabilities with GPT-4 in addressing questions that require more advanced reasoning.

Research from 2023 noted a 60 percent accuracy rate for ChatGPT in responding to higher-order radiology board examination-type questions and a 21 percentage point improvement (to 81 percent) when GPT-4 answered the same questions.

“When comparing ChatGPT versions (v3.5 vs. v4), ChatGPTv4 showed a superior contextual understanding of radiology-specific terms and imaging descriptions. Further studies have pointed to ChatGPTv4’s potential in generating structured radiology reports, providing detailed report explanations and benefits,” pointed out Keshavarz and colleagues.

Three Key Takeaways

  1. Diagnostic support and decision making. ChatGPT demonstrated high performance in approximately 84 percent of the studies. More specifically, the chatbot demonstrated a high accuracy rate of 88.9 percent in determining appropriate imaging for breast pain and a median agreement of 83.6 percent with reference standards. Adjunctive use of ChatGPT significantly improved radiologist decision making in 70 percent of studies (14/20), suggesting its potential as a diagnostic support tool.
  2. Radiology reporting. ChatGPT significantly benefits the structuring, simplification, and generation of radiology reports, as noted in all studies examining its use with radiology reports. This suggests its utility in enhancing efficiency and clarity in reporting processes, potentially improving workflow and communication.
  3. Limitations in subspecialty radiology. While ChatGPT demonstrates promise as a supplementary tool in radiology decision-making, its standalone use reveals significant limitations, particularly in subspecialty areas such as interventional radiology. One study noted a 40 percent accuracy rate in answering basic questions related to interventional radiology, indicating its struggles with specialized knowledge domains. Furthermore, ChatGPT was outperformed by neuroradiologists in another study, underscoring the importance of human expertise in complex and nuanced radiological interpretations. These findings caution against over-reliance on ChatGPT and emphasize the necessity of human expertise, especially in subspecialty areas.

While the review authors suggested that standalone automation of radiology tasks with ChatGPT is technically feasible, they noted the potential for significant errors, particularly in subspecialty areas of radiology. One of the reviewed studies revealed a 40 percent accuracy rate for ChatGPT in answering basic questions related to interventional radiology. Other research showed that ChatGPT was significantly outperformed by neuroradiologists.

“These findings show ChatGPT's role as a supplementary tool in clinical decision-making rather than a replacement for experienced professionals,” maintained Keshavarz and colleagues.

(Editor’s note: For related content, see “Can GPT-4 Improve Accuracy in Radiology Reports?,” “What New Research Reveals About ChatGPT and Ultrasound Detection of Thyroid Nodules” and “Can ChatGPT Pass a Radiology Board Exam?”)

Noting that all of the studies cited limitations with ChatGPT, ranging from hallucinations and fictitious references to privacy concerns, the review authors emphasized that none of the reviewed studies recommended using ChatGPT output without radiologist review.

“Our study highlights (the) critical need for careful verification of the factual accuracy and relevance of responses from LLMs (large language models) when used in clinical settings, where incorrect information could disrupt medical operations or lead to detrimental outcomes, regardless of the response's confidence,” added Keshavarz and colleagues.
