In a recent interview, Rajesh Bhayana, M.D., shared insights from new research that compared the abilities of ChatGPT-3.5 and ChatGPT-4 to answer text-based questions akin to those found on a radiology board examination.
The recently released ChatGPT-4 (OpenAI) may offer more advanced reasoning, be less prone to hallucinations, and be more capable of passing a radiology board exam than ChatGPT-3.5 (OpenAI), according to newly published research.
In prospective studies published recently in Radiology, researchers assessed the performance of ChatGPT-3.5 and ChatGPT-4 in answering 150 text-based multiple-choice questions akin to those found on a radiology board examination.
The researchers found that ChatGPT-4 correctly answered more than 80 percent of the questions, compared to 69 percent for ChatGPT-3.5. ChatGPT-4 also demonstrated a greater than 20 percent improvement over ChatGPT-3.5 on questions requiring higher-order thinking, including description of imaging findings, classifications, and application of concepts, according to the study authors.
In a recent interview, Rajesh Bhayana, MD, FRCPC, the lead author of the studies, said the technology behind ChatGPT is clearly showing significant improvement.
“The fact that ChatGPT-4 performed better than ChatGPT-3.5 and had less frequent incorrect answers and also performed better with higher-order reasoning suggests the frequency of hallucinations is in fact decreasing,” noted Dr. Bhayana, an abdominal radiologist and technology lead in the Department of Medical Imaging at the University of Toronto in Canada.
(Editor’s note: For related content, see “Can ChatGPT Have an Impact in Radiology?” and “Can ChatGPT Provide Appropriate Information on Mammography and Other Breast Cancer Screening Topics?”)
While Dr. Bhayana said there is significant potential for the use of ChatGPT in radiology, he cautioned that accuracy remains an issue and that use of the technology still requires rigorous fact-checking.
“It was very impressive that these models, based on the way they work and based on the fact that they are general models, performed so well in a specialty like radiology where language is so critical,” maintained Dr. Bhayana. “(But) it still does get things wrong. When it does get those things wrong, it uses very confident language. If you’re a novice and you can’t separate fact from fiction, it can be tough to know what’s right and what’s wrong. Especially for education, especially for novices when you’re looking up that information and learning something for the first time, you can’t rely on it. If you do use it, you have to always fact check it.”
For more insights from Dr. Bhayana, watch the video below.