A voice recognition system didn't fare well in a study by researchers from Thomas Jefferson University in Philadelphia that reviewed the number of errors on signed reports from attending radiologists. A spokesperson for the dictation company said the numbers can be misleading.
Dr. Ronald Dolin and colleagues reviewed 395 consecutive reports from 41 attending radiologists. All reports were generated using PowerScribe 4.7, which had been in use for about 16 months.
Researchers classified errors as significant if they altered or obscured the meaning of the sentence in which they appeared. Dictation errors were categorized into 10 subtypes, including missing or extra words, wrong words, typographical or grammatical errors, nonsense phrases with unknown meaning, and inaccuracies in the dictation date.
Investigators found 239 errors in 146 reports and identified at least one error in reports from 40 of the 41 attending radiologists. Significant errors accounted for 17% of the total. Twenty-two radiologists had at least one significant error on a report, while five had two or more. Nearly 70% of the insignificant errors involved wrong, missing, or extra words. Dolin concluded that a periodic audit of reports may identify specific error patterns and alert radiologists to alter their dictation practices to prevent future significant errors.
"It's important to note that 249 out of the 395 reports had no errors," said Peter Durlach, senior vice president of healthcare marketing and product strategy at Nuance, which sells PowerScribe.
Durlach said the product has an accuracy rate of 97% and that Dolin's results actually reflect a 1.2% error rate. To arrive at that figure, he estimated the average length of a report to be 50 words, putting the total word count at 19,750 (395 reports × 50 words). Dividing the number of errors (239) by that total yields an error rate of 1.2%.
"Some accounts of the study reported a double-digit error rate," Durlach said. "Overall, Dolin found that 37% of the reports had at least one error. Some people mistakenly interpreted that as an overall 37% error rate."
Manufacturers of voice recognition software should implement a confidence threshold below which any uncertain word or phrase would be highlighted, said Dr. William Morrison, director of musculoskeletal imaging at TJU. He also lamented that these systems lack even basic grammar checks, which could reduce errors.
"When we had humans doing this, they'd flag anything that went below a certain confidence level of understanding. You want the program to do the same," Morrison said.
Durlach said the company is working aggressively to improve its flagging algorithms.