What would it be like if we could use AI-powered voice modes as filters for radiology dictations?
One of the formative experiences of my early teenage years was reading the Hitchhiker’s Guide to the Galaxy series. If I had previously seen sci-fi mixed with comedy at all, it had not been done anywhere near as well. No other book ever made me laugh so much. If you have only seen the movie adaptation, do yourself a favor and read the original. The movie barely scratched the surface.
The Sirius Cybernetics Corporation turns up repeatedly in the storyline, producing a healthy share of the humor, as their products never quite seem to operate as one might wish. At some point, they incorporated “Genuine People Personalities” (GPP) into a bunch of their robots and computers. Essentially, instead of having a boring old machine that just performs its function, you get one that acts like a person.
Unfortunately, people can be downright unpleasant, and the personalities involved don’t come off as improvements to the products. One of the main characters, for instance, is Marvin, a robot with an outrageously overpowered brain. Nicknamed the Paranoid Android, he is overwhelmingly depressed and more than a little dismissive of (and insulting to) the comparatively stupid humans who give him his orders.
Life imitates art, and Grok (the AI chatbot from Elon Musk’s xAI) recently unveiled multiple new “voice” modes. The standard mode remains the default, but you can now opt for voices like “storyteller,” “professor,” “Grok ‘Doc’,” “unlicensed therapist,” or “meditation.” If you want something wackier, you can go for “conspiracy” or even not-safe-for-work (NSFW) options like “sexy” and “unhinged.”
In other words, depending on which mode you choose, you can have the AI tell you things in an academic, wonky way, in a roundabout fashion that goes off on tangents, or with a more philosophical bent. It can also flirt with you or curse a blue streak.
Unlike the Sirius Cybernetics products, Grok allows you to turn these modes off the moment you have had enough of them or switch to other modes as they seem suited to particular situations.
It took me less than a day to start thinking about using them as filters for radiology dictations.
To my disappointment, there isn’t (yet) a function that lets you put text through a Grok “voice” translator. I recall that, many years ago, one could easily find Google Translate-like webpages capable of rendering whatever text you wanted into other fun formats. For instance, the “En-Cheferizer” could turn anything you wrote into gibberish worthy of the Muppets’ Swedish Chef. There was also one for Snoop Dogg.
A step between hither and yon occurred in the world of GPS. I believe it was either Garmin or TomTom that first allowed you to have celebrities like John Cleese narrating your driving directions. I am told that Waze expanded on that with fictional characters like Darth Vader and Yoda.
For practical purposes, that can only go so far. Entertainment has its value, but ultimately you are supposedly using those devices to guide you somewhere you need to go. They have a limited repertoire of telling you to turn this way or that, as opposed to AI’s theoretical ability to adapt to any particular situation. A GPS doesn’t need to be able to tell you the history of the 7-Eleven franchise just because you happen to be sitting near one while you wait for a traffic light to turn.
Whether or not you would use it for that particular purpose, an AI’s ability to do so is the sort of thing people want out of it. It will at least try to adapt to any novel situation to fulfill whatever has been asked. That is one of the reasons folks think it might eventually take away human jobs, including those of radiologists. Nobody knows what the software will be capable of doing.
Well before that happens, though, I could imagine rads using things like Grok’s “voice” modes to filter our reports, and not just for proofreading. Alternatively, we could send our reports the way we normally do, and recipients could activate the filters.
Suppose, for instance, you are someone who really had no business ordering an advanced imaging study (but you did it anyway). You don’t understand the pathology involved, why one type of imaging is more appropriate than another, or what to do with the results. Now you need them spoon-fed to you.
You might just gratefully reach for the “professor” or “Grok Doc” filter so the AI could flesh out the report with explanations of everything in it. Sure, you could look up each item that confused you, but the AI would do it all in one go. This is what a myelolipoma is, that is what azygos continuation of the inferior vena cava (IVC) is ... and now you know why nothing in the report requires further action from you. Heck, you might just be able to parrot what the AI told you (without saying where you picked up the info) and pretend to be a lot smarter than you are.
Rads could, of course, make use of these voice modes from our end. We generally aren’t supposed to exhibit our personalities through our reports. They are supposed to be as objective and soulless (“professional”) as possible. That holds even when we are dying to express frustration, sarcasm, or other very human traits, no matter how much circumstances tempt us, such as when we are forced to render unnecessary addenda.
For instance, we might be tempted to say something snarky when we have gotten a chest CT to prove, once again, that a 1 mm lung nodule that has been stable since 2008 is still unchanged. Another circumstance may involve a stone-cold normal report and a request to “make an addendum commenting on the appendix.”
If we give in to temptation and snark away, it will be pure trouble for us down the line, no matter how good it feels now. On the other hand, suppose we render our personality-free, professional report and take a moment to put it through a “sarcastic condescending academic” voice filter. We can feast our eyes and briefly fantasize about hitting the “sign” button before reverting the report to normal.
Alternatively, we could borrow a trick from the ever-increasing number of folks outside the medical world who have to write work-related messages, soulless things analogous to our rad reports. Write (or dictate) whatever you like, venting your spleen as you see fit. Then put it through a “professional voice” filter and watch all your objectionable statements get purged.
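For anyone inclined to tinker, here is a minimal sketch of how such a filter could be prototyped around a general-purpose text-generation model. The voice labels, the prompts, and the generate placeholder are hypothetical illustrations under that assumption, not a description of Grok’s actual interface or any vendor’s API.

```python
# Hypothetical sketch: re-rendering dictated report text in a chosen "voice."
# The generate() call is a placeholder; wire it to whatever text-generation
# client you actually use. Nothing here reflects a real product's API.

VOICE_PROMPTS = {
    "professional": (
        "Rewrite the following radiology report text in a neutral, objective, "
        "professional tone. Remove sarcasm, frustration, and editorializing. "
        "Do not alter any findings, measurements, or recommendations."
    ),
    "professor": (
        "Rewrite the following radiology report text with brief, plain-language "
        "explanations of each finding, without changing the findings themselves."
    ),
}


def generate(prompt: str) -> str:
    """Placeholder for a call to a text-generation model."""
    raise NotImplementedError("Connect this to your own LLM client.")


def apply_voice_filter(report_text: str, voice: str = "professional") -> str:
    """Return the report re-rendered in the requested voice."""
    instructions = VOICE_PROMPTS[voice]
    return generate(f"{instructions}\n\n---\n{report_text}")


# Example usage (once generate() is implemented):
# clean_report = apply_voice_filter(draft_with_snark, voice="professional")
```

The key design point is that the prompt instructs the model to change tone only, leaving findings and measurements untouched; any real deployment would still need a human to verify that nothing substantive was altered before signing.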