Perhaps it’s because I’ve got baseball’s spring training on my mind, or I’ve still got a jumble of worthless football-related statistics rattling around in my head from last season. Whatever the reason, when I got my performance report for last quarter’s radiology work, I found myself musing about the stats used to measure us.
Accuracy rates, turnaround time, work volume: plenty of ways to stratify your radiologists. Or at least to rattle their cages and see who gets the most neurotic.
It occurred to me that in radiology, we’re incomplete with our statistics. All of our numbers pertain to the imaging department’s side of things. This is sort of like focusing on a baseball pitcher’s ERA without considering the batters he’s facing. Our batters, therefore, would be the clinicians who send us our cases. (You can make the clinicians the pitchers and us the batters if you prefer.) I think we’re missing out on a lot of valuable, 360-degree-review-style feedback by not tracking their statistics along with our own.
To start, we could have a true-positive percentage (TPP), analogous to a batting average. Simply put, what percentage of imaging studies sent by a given clinician yields an abnormality? A nurse in the ER who ordered a gazillion abdominal CTs but only turned up pathology one time in 20, for instance, would have a TPP of .050. A seasoned diagnostician who only sent patients for imaging when he had a high index of suspicion, and whose studies were abnormal one time in three, would have a TPP of .333, a respectable number even for a major-leaguer.
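For the spreadsheet-inclined, here’s a minimal sketch (in Python, purely for illustration) of how a TPP might be tallied. The function name and the list-of-booleans record format are my own inventions, not any established radiology metric:

```python
def true_positive_percentage(studies: list[bool]) -> float:
    """Fraction of a clinician's ordered studies that turned up pathology.

    Each entry in `studies` is True if that imaging study was abnormal.
    """
    if not studies:
        return 0.0
    return sum(studies) / len(studies)

# The ER nurse from above: pathology in 1 of 20 abdominal CTs -> .050
print(f"{true_positive_percentage([True] + [False] * 19):.3f}")  # 0.050

# The seasoned diagnostician: abnormal 1 time in 3 -> .333
print(f"{true_positive_percentage([True, False, False]):.3f}")   # 0.333
```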
For anyone who takes this too seriously and worries that clinicians would be tempted to game the system, refraining from imaging patients to beef up their TPP (only sending rip-roaring sick patients for imaging and withholding studies from others, potentially missing pathology as a result), we could balance things with a false-negative percentage (FNP).
Of greater interest and potential fun would be a stat analogous to baseball’s slugging percentage. Let’s call it the Clinical Prowess Quotient, or CPQ. For those unfamiliar, the slugging percentage is a measurement of not only how frequently one hits the ball, but how powerfully/effectively he does so.
Similarly, the CPQ would measure how accurately the provided clinical history for imaging matched the actual diagnosis. A history of “R/O Pathology” or “Pain” would be a zero, no matter what the study showed. On the other hand, a clinical history of “Acute appendicitis” for a scan that, indeed, showed an appy would get 100 percent (batting 1.000 in baseball terms). The same patient, with a history of “right lower quadrant pain,” might yield a CPQ of .333; at least the clinician was accurate in terms of where the pathology was.
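Again purely for fun, a hypothetical CPQ scorer might look like the sketch below. The credit tiers are assumptions inferred from the examples above; there is no official rubric:

```python
# Hypothetical credit tiers for the Clinical Prowess Quotient (CPQ).
CPQ_CREDIT = {
    "exact_diagnosis": 1.000,  # e.g., "Acute appendicitis" for an appy
    "correct_region": 0.333,   # e.g., "right lower quadrant pain"
    "nonspecific": 0.000,      # e.g., "R/O Pathology" or just "Pain"
}

def clinical_prowess_quotient(histories: list[str]) -> float:
    """Average the history-accuracy credit across a clinician's studies."""
    if not histories:
        return 0.0
    return sum(CPQ_CREDIT[h] for h in histories) / len(histories)

# One quarter's worth of (made-up) appendicitis cases:
quarter = ["exact_diagnosis", "correct_region", "nonspecific"]
print(f"{clinical_prowess_quotient(quarter):.3f}")  # 0.444
```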
Understandably, some of these stats would have subjective elements. Then again, our own metrics in radiology aren’t free of such imperfection. One man’s “agree with original interpretation” is another’s “subtle discrepancy.” Even the major leagues have their disputes over what was or was not a fielding error.
I’m sure that some of our clinical colleagues would bridle at being graded by us, directly or indirectly. Heck, I don’t think that many of us relish the feeling of being constantly watched and reassessed, but telling them that “turnabout is fair play” probably wouldn’t smooth any ruffled feathers. We might get better traction from the promise of bragging rights, or even a new set of credentials to flaunt on the job market: “Hey, we should really try to recruit Dr. X. I hear his CPQ is over .500!” Who knows, Vegas might start offering bets on this stuff.