How informatics can inform and improve patient outcomes in radiology, from ACR 2016.
Radiology has done a great job of mastering the first two levels of efficacy in diagnostic imaging, technical efficacy and diagnostic accuracy efficacy, but improving the value of care lies in the levels not yet mastered, Hanna Zafar, MD, assistant professor of radiology at the Hospital of the University of Pennsylvania, said at ACR 2016.
“People want to know about diagnostic efficacy, treatment efficacy, patient outcomes, and societal efficacy,” she said. “And that’s where a lot of the interest is in improving the value of care we deliver in radiology.”
Radiologists need to be at the forefront of this movement and conduct the research that allows imaging to be correlated with patient outcomes, Zafar said.
Zafar applauded women’s imaging and BI-RADS for their successful correlation with patient outcomes. BI-RADS’ success stems from a set lexicon that translates into codes that all radiologists understand, she said.
“The codes are then associated with a set management algorithm and we know what the likelihood of malignancy is that’s associated with each code,” she added.
Zafar described an initiative at Penn called Code Abdomen, which tried to mimic the same idea and, they hoped, achieve the same success in measuring patient outcomes. They decided to tackle non-emergent actionable imaging findings. Abdominal imaging, in particular, is rife with these findings, which turn up not only in patients with known cancer diagnoses but also in patients with no cancer diagnosis, she said.
The problem with these findings, Zafar said, lies in the free-text report and the heterogeneity in how radiologists describe them.
“We don’t do a great job internally at agreeing on the words or the lexicon that we should be putting into those reports,” she said. “In the free text report, you are going to have a very hard time identifying patients with those worrisome findings between two different radiology reports.”
Communication is a significant factor. Radiologists often rely on the provider to tell the patient to get follow-up, but there is also the question of whether follow-up is clinically indicated, which radiologists don’t always know because their workflow doesn’t leave time to dig through the EMR.
In Code Abdomen, Zafar’s group combined a prospective and retrospective approach in an effort to standardize radiology report codes or language while using natural language processing to mine large amounts of data. The project required months of preparation and a multidisciplinary team effort.
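To make the mining idea concrete, here is a minimal Python sketch of the kind of extraction a standardized code makes trivial compared with free text; the tag format (“CODE LIVER 3”) and the function name are invented for illustration and are not the actual Code Abdomen syntax or Penn’s NLP pipeline.

```python
import re

# Hypothetical example: a standardized tag embedded in the impression is trivial
# to mine, whereas heterogeneous free-text phrasing is not. The tag format below
# is invented for illustration, not the actual Code Abdomen syntax.
CODE_PATTERN = re.compile(
    r"CODE\s+(LIVER|PANCREAS|KIDNEY|ADRENAL)\s+(\d{1,2})", re.IGNORECASE
)

def extract_codes(report_text: str) -> list[tuple[str, int]]:
    """Return (organ, code) pairs found in a single report impression."""
    return [(organ.upper(), int(code)) for organ, code in CODE_PATTERN.findall(report_text)]

impression = "1.5 cm indeterminate hepatic lesion. CODE LIVER 3. Recommend MRI in 6 months."
print(extract_codes(impression))  # [('LIVER', 3)]
```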
At first, they focused on four organs that commonly have non-emergent actionable findings: liver, pancreas, kidney, and adrenals. The system assigns masses to three categories: benign, indeterminate, and suspicious.
“It’s similar to BI-RADS, but we are starting off way back in the beginning of the system,” she said. “We don’t have the lexicon yet, we don’t have evidence-based guidelines for every type of mass in every organ.”
Radiologists were instructed to use Bosniak or LI-RADS when applicable, but if no such system existed, they were to assign their Code Abdomen impression and their recommendation for how to proceed. The codes range from 0 to 7, plus 99 for technically inadequate examinations. Each code maps to one of the three main categories and carries a description and specific language associated with it. It’s not enough for radiologists to just describe what they see, though.
“We have forced the radiologist to make up their mind and commit,” she said. “How concerned are you that this is a cancer?”
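As a rough sketch of how a coding scheme like this might be represented in software, the snippet below validates the 0–7 and 99 code range and the three main categories; the field names and the particular code-to-category pairing in the example are placeholders, since the talk does not spell out the underlying data model.

```python
from dataclasses import dataclass

VALID_CODES = set(range(8)) | {99}   # 0 through 7, plus 99 for technically inadequate exams
CATEGORIES = {"benign", "indeterminate", "suspicious"}

@dataclass
class CodeAbdomenFinding:
    """Hypothetical record for one coded finding; field names are invented for
    illustration and are not the actual Code Abdomen data model."""
    organ: str       # liver, pancreas, kidney, or adrenal
    code: int        # 0-7, or 99
    category: str    # which of the main categories the code maps to

    def __post_init__(self) -> None:
        if self.code not in VALID_CODES:
            raise ValueError(f"{self.code} is not a valid Code Abdomen code")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")

# Illustrative pairing only; the real code-to-category mapping is not described in the talk.
finding = CodeAbdomenFinding(organ="kidney", code=3, category="indeterminate")
```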
About two years after the project was first proposed came another big change for radiologists.
“We began to actually require that codes for indeterminate masses had to give a modality and a timing for follow up,” Zafar said. “Radiologists had to give some sort of guidance to the ordering clinician as to what they think a reasonable follow-up plan would be.”
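In data terms, that requirement amounts to refusing to finalize an indeterminate code without a modality and a time frame. A minimal sketch of such a sign-off check, with invented field names, might look like this:

```python
from typing import Optional

def validate_followup(category: str,
                      followup_modality: Optional[str],
                      followup_months: Optional[int]) -> None:
    """Hypothetical sign-off check: indeterminate findings must carry a
    recommended follow-up modality and interval before the report is finalized."""
    if category == "indeterminate" and (not followup_modality or followup_months is None):
        raise ValueError("indeterminate codes require a follow-up modality and timing")

validate_followup("indeterminate", "MRI abdomen", 6)   # passes
# validate_followup("indeterminate", None, None)       # would raise ValueError
```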
Once the follow-up became a required data element, Zafar’s team was able to look at all of the data as a whole and move towards measuring outcomes.
“The standardized codes facilitate identification of detailed clinical data and allow you to start measuring variability between organs, modalities, and radiologists,” she said.
They can identify which organs account for the vast majority of these findings (liver and kidney) and what the vast majority of lesions are (no mass or benign). They can also look at codes in a single organ and track how those have changed over time.
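Once the codes are structured data, measuring that variability is a straightforward aggregation. The following pandas sketch, over a hypothetical table of coded findings with invented column names, shows the kind of breakdowns described:

```python
import pandas as pd

# Hypothetical table of coded findings; column names and values are invented for illustration.
findings = pd.DataFrame({
    "organ":       ["liver", "kidney", "liver", "pancreas", "kidney"],
    "modality":    ["CT", "CT", "MRI", "CT", "US"],
    "radiologist": ["A", "A", "B", "C", "B"],
    "code":        [3, 1, 4, 0, 3],
    "exam_year":   [2013, 2013, 2014, 2014, 2014],
})

# Distribution of codes by organ and by radiologist, mirroring the variability measurement described.
by_organ = findings.groupby(["organ", "code"]).size().unstack(fill_value=0)
by_rad   = findings.groupby(["radiologist", "code"]).size().unstack(fill_value=0)

# Tracking how codes in a single organ shift over time.
liver_over_time = (findings[findings["organ"] == "liver"]
                   .groupby(["exam_year", "code"]).size().unstack(fill_value=0))

print(by_organ, by_rad, liver_over_time, sep="\n\n")
```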
Patient Outcomes
All of this data was a step towards measuring patient outcomes. Most notably, this system allowed Zafar’s team to identify patients at risk of falling through the cracks.
“We know that physicians fail to acknowledge abnormal results in the EMR about 1/3 of the time, and we also know that 1/3 of providers don’t tell patients about abnormal test results,” she said. But in order to study a problem, you have to be able to measure it and, historically, it has been difficult for radiologists to identify what proportion of patients aren’t getting their recommended follow-up.
By standardizing the report language, Zafar’s team was able to build a database that sits on top of the RIS and mines it to identify patients with the applicable codes. The database can then check within the RIS whether those patients have any relevant completed or scheduled imaging, which allows the team to identify patients at risk of not receiving recommended follow-up.
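The logic of that safety-net query can be sketched in a few lines of Python; the record shapes and thresholds below are invented, since the actual RIS schema was not described, but the idea is simply an indeterminate code with no relevant completed or scheduled imaging inside the recommended window.

```python
from datetime import date, timedelta

# Hypothetical records; a real implementation would query the RIS database directly.
coded_findings = [
    {"mrn": "123", "organ": "liver", "category": "indeterminate",
     "report_date": date(2015, 1, 10), "followup_months": 6},
    {"mrn": "456", "organ": "kidney", "category": "benign",
     "report_date": date(2015, 2, 1), "followup_months": None},
]
followup_exams = [
    {"mrn": "123", "status": "scheduled", "exam_date": date(2016, 3, 1)},
]

def at_risk(finding, exams, grace_months=3):
    """Flag indeterminate findings with no completed or scheduled imaging inside
    the recommended window plus a grace period (all values illustrative)."""
    if finding["category"] != "indeterminate":
        return False
    due_by = finding["report_date"] + timedelta(
        days=30 * (finding["followup_months"] + grace_months))
    return not any(e["mrn"] == finding["mrn"] and e["exam_date"] <= due_by
                   for e in exams)

flagged = [f["mrn"] for f in coded_findings if at_risk(f, followup_exams)]
print(flagged)  # ['123'] here, because the scheduled exam falls outside the window
```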
A three-month manual review of the data shed light on a few kinks. Because radiologists weren’t yet familiar with the codes, the number of patients requiring follow-up was inflated. They were putting too-small-to-characterize lesions in the 0 and 3 buckets because they didn’t know how else to classify them.
“After a while, we started talking about this and they realized that’s not really appropriate,” she said. “The overwhelming majority of these are going to be benign, and we know that based on the literature, unless the patient has a known cancer.”
Incomplete chart documentation also posed a challenge.
“We don’t do a great job of capturing data on who our providers are that are taking care of patients,” she said. This is especially a problem in the emergency department or with inpatients. Also, providers don’t always document why follow up is not clinically indicated, she said.
Zafar’s team also used the data to look at how many patients in the same time interval actually had a pathology-proven cancer or underwent a surgery that showed a malignancy. Nine percent of the applicable patient population had cancer.
Leadership buy-in and patience were imperative to this project, Zafar said. But she urged that initiatives like this one can be achieved outside of tertiary care systems.
The coding system shed light on the power of informatics, not just in standardizing radiology reporting and communication, but in identifying cohorts of patients and tracking their outcomes.