Can ChatGPT and Bard Bolster Decision-Making for Cancer Screening in Radiology?

In a study examining the potential of the large language models ChatGPT-4 and Bard to follow ACR Appropriateness Criteria for breast cancer, lung cancer, ovarian cancer and colorectal cancer screening, researchers noted “impressive accuracy in making radiologic clinical decisions.”

In what may be the first study to compare the use of large language models across multiple areas of clinical screening, researchers suggested that ChatGPT-4 and Bard may play a beneficial role in radiology decision-making for screening of common cancers such as breast cancer and lung cancer.

For the study, recently published in Academic Radiology, researchers examined the use of prompt engineering to enhance the accuracy of the large language models relating to the appropriate use of imaging for conditions including breast cancer, ovarian cancer, colorectal cancer, and lung cancer.

Employing American College of Radiology (ACR) Appropriateness Criteria, the researchers compared the performance of ChatGPT-4 (OpenAI) and Bard (Google) with open-ended (OE) prompts and more specific select all that apply (SATA) prompts.
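
To make the two prompt formats concrete, here is a minimal Python sketch of how an OE prompt and a SATA prompt might be posed to a chat-model API. The clinical scenario, the candidate modality list, and the prompt wording are illustrative placeholders rather than the study's actual prompts, and the example assumes the openai Python SDK (v1 or later) with an API key configured in the environment.

```python
# Minimal sketch of open-ended (OE) vs. select-all-that-apply (SATA) prompting.
# The scenario text and modality options are illustrative placeholders, not the
# prompts used in the study. Assumes the openai Python SDK (v1+) is installed
# and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

SCENARIO = ("Average-risk 45-year-old woman presenting for routine "
            "breast cancer screening.")

# OE format: the model must name the appropriate imaging on its own.
oe_prompt = (
    f"Scenario: {SCENARIO}\n"
    "Per the ACR Appropriateness Criteria, which imaging study or studies, "
    "if any, are usually appropriate? Explain briefly."
)

# SATA format: the model chooses from a predefined option list.
options = ["Digital mammography", "Breast MRI", "Breast ultrasound",
           "CT chest with contrast", "No imaging indicated"]
sata_prompt = (
    f"Scenario: {SCENARIO}\n"
    "Select ALL of the following options that are usually appropriate "
    "per the ACR Appropriateness Criteria:\n"
    + "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(options))
)

for label, prompt in [("OE", oe_prompt), ("SATA", sata_prompt)]:
    response = client.chat.completions.create(
        model="gpt-4",  # stand-in model name; the study used ChatGPT-4 and Bard
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---\n{response.choices[0].message.content}\n")
```

In the OE format the model must generate and justify its own modality choices, while the SATA format constrains it to a predefined option list, mirroring the two conditions the researchers compared.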

For breast cancer screening, the researchers noted fairly similar accuracy between ChatGPT-4 and Bard with both prompt formats. ChatGPT-4 had a 1.82 (out of 2) average OE prompt score in comparison to 1.89 for Bard. For SATA prompts, Bard demonstrated 82 percent accuracy while ChatGPT-4 achieved 85 percent, according to the study authors.

“We observed that ChatGPT-4 and Google Bard displayed impressive accuracy in making radiologic clinical decisions when prompted in either OE or SATA formats,” wrote study co-author Young H. Kim, M.D., Ph.D., who is affiliated with the University of Massachusetts Chan Medical School in Worcester, Mass., and colleagues.

However, the study authors did note a few differences between the large language models.

• Average scoring with the use of predefined options in SATA prompts showed that ChatGPT-4 outperformed Bard across all cancer screening scenarios in the study, with the difference most pronounced for ovarian cancer screening. Overall, researchers pointed out an 83 percent average accuracy score for ChatGPT-4 in comparison to 70 percent for Bard.

• For average OE prompt scoring, ChatGPT-4 outperformed Bard for lung cancer and ovarian cancer screening while Bard was slightly better for breast and colorectal cancer screening, according to the study.

• Assessment of the large language models for ovarian cancer screening revealed the widest gap between the two models. Bard had an OE prompt score of 0.50 (out of 2) in comparison to 1.50 for ChatGPT-4, and the researchers noted 41 percent accuracy for Bard on SATA prompts in contrast to 70 percent for ChatGPT-4 (see the scoring sketch after this list).
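
For readers curious how figures like a 1.50 OE score or 70 percent SATA accuracy might be aggregated, the following sketch shows one plausible approach: reviewer grades on a 0-to-2 scale are averaged for OE responses, and SATA accuracy is taken as the fraction of predefined options the model handled correctly. The article does not spell out the study's exact rubric, so both metrics here are assumptions for illustration.

```python
# Hypothetical aggregation of the two score types reported in the study.
# OE: reviewers grade each free-text response 0 (incorrect), 1 (partially
# correct), or 2 (correct); the reported figure is the mean grade.
# SATA: accuracy is taken here as the fraction of predefined options the
# model classified correctly (selected when appropriate, omitted when not).
# This rubric is an assumption for illustration; the article does not
# specify the study's exact scoring rules.

def mean_oe_score(grades: list[int]) -> float:
    """Average reviewer grades on the 0-2 OE scale."""
    return sum(grades) / len(grades)

def sata_accuracy(selected: set[str], appropriate: set[str],
                  all_options: set[str]) -> float:
    """Fraction of options classified correctly (chosen iff appropriate)."""
    correct = sum(
        1 for opt in all_options
        if (opt in selected) == (opt in appropriate)
    )
    return correct / len(all_options)

# Example: the model picks two of four options; three of the four
# options end up classified correctly.
options = {"Mammography", "Breast MRI", "Breast US", "CT chest"}
print(mean_oe_score([2, 2, 1, 2]))                                            # 1.75
print(sata_accuracy({"Mammography", "CT chest"}, {"Mammography"}, options))   # 0.75
```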

Three Key Takeaways

  1. Comparable accuracy in breast cancer screening. The study found that both ChatGPT-4 and Google Bard demonstrated impressive accuracy in making radiologic clinical decisions for breast cancer screening. The accuracy scores were fairly similar between the two models, with ChatGPT-4 scoring 1.82 (out of 2) on average with open-ended prompts, compared to Bard's score of 1.89. For select all that apply (SATA) prompts, Bard achieved 82 percent accuracy, while ChatGPT-4 offered a slightly higher accuracy of 85 percent.
  2. Differential performance across cancer types. The study observed differences in the performance of ChatGPT-4 and Bard across different cancer types. Notably, ChatGPT-4 outperformed Bard in average scoring with predefined options in SATA prompts across all cancer imaging, with a more significant difference in ovarian cancer screening. ChatGPT-4 achieved an 83 percent average accuracy score, while Bard scored 70 percent. For ovarian cancer screening specifically, ChatGPT-4 had a higher accuracy in both open-ended and SATA prompts compared to Bard.
  3. Effectiveness of prompt engineering. The researchers highlighted the importance of prompt engineering in improving the accuracy of responses from large language models (LLMs). While both open-ended (OE) and select all that apply (SATA) prompts were used, the study found that OE prompts were more effective in enhancing performance for both ChatGPT-4 and Bard.

(Editor’s note: For related content, see “Can ChatGPT Pass a Radiology Board Exam?,” “Can ChatGPT Have an Impact in Radiology?” and “Can ChatGPT be an Effective Patient Communication Tool in Radiology?”)

Additionally, while prompt engineering improved the performance of both large language models with OE prompts, the study authors did not see similar benefits with SATA prompts. While acknowledging the potential for bias in the training data toward OE prompts, the researchers said the flexibility of OE prompts may make them preferable to SATA prompts.

“ … Our findings support the idea of implementing (prompt engineering) in an OE format to improve the accuracy of the responses in unique clinical settings, such as when imaging modalities are not provided or when clinicians are unable to list all the possible imaging modalities for a given scenario,” added Kim and colleagues.

In regard to study limitations, the authors conceded that scoring of the LLM responses is subjective and noted there were only two scorers for the study. The researchers also cautioned against broad extrapolation of the findings, given the study's focus on screening for four types of cancer and on clinical guidelines established by the ACR.
