In a recently issued statement from multiple radiology societies, including the RSNA and ACR, researchers offer practical advice for evaluating artificial intelligence (AI) tools, implementing AI into current workflows, and monitoring the technology to help ensure optimal benefit and effectiveness.
Calling artificial intelligence (AI) “the single most disruptive influence on radiology in many decades,” researchers on behalf of five leading radiology societies, including the American College of Radiology (ACR), the Radiological Society of North America (RSNA) and the European Society of Radiology (ESR), have published a multinational statement on practical considerations for assessing, implementing and monitoring AI tools in radiology.
Simultaneously published in five different journals, including Radiology: Artificial Intelligence, Insights into Imaging, and the Journal of the American College of Radiology (JACR), the multi-society statement delves into potential biases with AI use, steps for assessing clinical accuracy, goal setting for monitoring of AI software, cost considerations and long-term viability.
Here are a few key takeaways from the statement.
1. The researchers emphasized diligent cost-benefit and return on investment (ROI) analyses, tailored to the health-care setting and local circumstances, when considering adjunctive AI tools or applications such as AI-enabled opportunistic screening. Tangible benefits of AI in outpatient imaging centers or fee-for-service hospital settings may range from an increased volume of findings that require follow-up exams or management to increased efficiency in emergency departments and shorter lengths of stay, according to the statement authors.
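To make the kind of cost-benefit arithmetic the authors describe concrete, here is a minimal sketch; every figure in it is a hypothetical assumption for illustration, not a number from the statement.

```python
# Hypothetical ROI sketch for an adjunctive AI tool. Every figure below is
# an assumption for illustration, not data from the multi-society statement.

annual_license_cost = 75_000   # assumed yearly AI software license fee
integration_cost = 20_000      # assumed IT/PACS integration cost, amortized over year 1
exams_per_year = 40_000        # assumed annual exam volume run through the model
added_followup_rate = 0.004    # assumed fraction of exams yielding new actionable findings
revenue_per_followup = 900     # assumed average revenue per resulting follow-up exam

annual_cost = annual_license_cost + integration_cost
annual_benefit = exams_per_year * added_followup_rate * revenue_per_followup

roi = (annual_benefit - annual_cost) / annual_cost
print(f"Annual benefit: ${annual_benefit:,.0f}, cost: ${annual_cost:,.0f}, ROI: {roi:.1%}")
```

Under these assumed numbers, 160 additional follow-up exams per year would yield an ROI of roughly 52 percent; swapping in local volumes, rates and costs is the point of the exercise.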
2. While AI’s potential to alleviate an increasing workload burden amid a shortage of radiologists has been widely discussed, the researchers noted that reduced burnout and improved radiologist recruitment tend to be “additive” benefits that are harder to measure against the costs of AI implementation.
3. The researchers pointed out that AI implementations that only send AI results to an existing Picture Archiving and Communication System (PACS) are problematic, given the potential for automation bias among radiologists and referring physicians’ lack of knowledge about the accuracy and other details of the AI model being utilized.
4. The statement authors emphasize the use of a system, such as a cloud-native environment, that enables radiologists to interact with and possibly modify AI results and share feedback with AI vendors.
“This type of interaction is facilitated in a cloud-native environment where both the PACS and AI models can share radiology data and AI results. Additionally, the ability to accept and store AI results along with radiologist feedback, optimize data security, and continuously monitor AI accuracy are crucial technical aspects that are facilitated in cloud-native systems,” wrote lead statement author Adrian Brady, M.D., president of the ESR and a clinical professor of radiology at University College Cork in Cork, Ireland, and colleagues.
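The statement does not prescribe a schema, but as a rough illustration of the kind of record such a system might store, pairing an AI result with radiologist feedback for later monitoring, consider the following sketch; all field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical record pairing an AI result with radiologist feedback.
# Field names are illustrative, not from the statement or any vendor API.
@dataclass
class AIResultFeedback:
    study_uid: str            # DICOM study instance UID
    model_name: str           # AI model that produced the result
    model_version: str        # version string, so software upgrades can be tracked
    ai_finding: str           # e.g., "pneumothorax: present"
    ai_confidence: float      # model-reported confidence score
    radiologist_agrees: bool  # accept/reject from the reading radiologist
    comment: str              # free-text correction to share back with the vendor
    reported_at: datetime     # timestamp for longitudinal accuracy monitoring
```

Storing the model version and timestamp alongside each accept/reject decision is what later makes it possible to tie a performance change to a specific upgrade, as discussed under continuous monitoring below.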
Other Considerations With Reported Error Rates
5. Reported error rates with AI model testing “may differ substantially” from those encountered in one’s own practice, according to the statement authors. They emphasized consideration of differences in scanner manufacturers, protocols, disease prevalence and the demographics of the local community in which the AI software is being deployed. Beyond error frequency, Brady and colleagues said those evaluating AI models should also consider the detectability and correctability of errors, as well as the potential impact of model errors on patients.
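One practical way to act on this caution is to score the model on a locally labeled sample before relying on vendor-reported figures. The sketch below is a minimal illustration of that comparison; the sample data and variable names are assumptions, not drawn from the statement.

```python
# Minimal local-validation sketch: score an AI model on a locally labeled
# sample and compare against vendor-reported figures. The ten exams below
# are hypothetical placeholders (1 = finding present, 0 = absent).

local_labels   = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]   # ground truth from local radiologists
ai_predictions = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1]   # model output on the same exams

tp = sum(1 for y, p in zip(local_labels, ai_predictions) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(local_labels, ai_predictions) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(local_labels, ai_predictions) if y == 1 and p == 0)
tn = sum(1 for y, p in zip(local_labels, ai_predictions) if y == 0 and p == 0)

sensitivity = tp / (tp + fn)   # compare against the vendor's reported sensitivity
specificity = tn / (tn + fp)   # compare against the vendor's reported specificity
ppv = tp / (tp + fp)           # depends heavily on local disease prevalence
print(f"Sensitivity {sensitivity:.0%}, specificity {specificity:.0%}, PPV {ppv:.0%}")
```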
Targeted Implementation of AI Software
6. Focusing implementation of AI models on health-care settings where disease prevalence is higher may facilitate improved acceptance of the model in question, according to the statement authors (see the worked example following the quote below).
“For example, pneumothorax (PTX) on chest X-ray (CXR) has a higher prevalence in the inpatient rather than the average population,” pointed out Brady and colleagues. “Limiting a PTX AI model to only inpatient CXRs will provide fewer false positive results and will more likely be accepted by the radiologists from an accuracy standpoint.”
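A short worked example shows why higher prevalence translates into fewer false positives for a fixed model, via the positive predictive value (PPV): the chance that a flagged exam truly has the finding. The sensitivity, specificity and prevalence figures below are assumptions for illustration, not values from the statement.

```python
# Worked example: how disease prevalence changes positive predictive value
# (PPV) for a fixed model. All numbers are assumptions for illustration.

sensitivity = 0.90   # assumed model sensitivity for pneumothorax on CXR
specificity = 0.95   # assumed model specificity

def ppv(prevalence: float) -> float:
    """Bayes' rule: P(finding present | positive AI result) at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed prevalences: higher among inpatients than in the broader CXR population.
print(f"Inpatient CXRs (5% prevalence):    PPV = {ppv(0.05):.0%}")    # ~49%
print(f"All-comer CXRs (0.5% prevalence):  PPV = {ppv(0.005):.0%}")   # ~8%
```

Under these assumptions, the same model goes from roughly one true finding per two alerts in the inpatient setting to roughly one in twelve in a low-prevalence population, which is exactly the acceptance problem the authors describe.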
Recognizing the Benefits of Continuous Monitoring with AI Models
7. The statement authors emphasized continuous monitoring of AI models and sharing of those assessments across multiple sites and geographic regions via an AI data registry. Doing so would allow registry participants to distinguish local issues affecting AI model performance from more systematic problems tied to software updates (see the sketch following the quote below).
“Hypothetically, analysis of the aggregate institutional registry data might show the poor performance to be limited to a single machine,” posited Brady and colleagues. “Further analysis might also show that the performance degradation occurred after a software upgrade to that machine or change in examination protocol.”
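A minimal sketch of the kind of registry analysis the authors hypothesize might look like the following; the record format, agreement metric and alert threshold are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical registry records: (machine_id, software_version, agreement_rate),
# where agreement_rate is the fraction of AI results radiologists accepted
# during a monitoring window. All values below are illustrative.
registry = [
    ("CT-01", "v2.1", 0.94), ("CT-01", "v2.2", 0.93),
    ("CT-02", "v2.1", 0.95), ("CT-02", "v2.2", 0.78),  # drop after upgrade
    ("CT-03", "v2.1", 0.92), ("CT-03", "v2.2", 0.91),
]

# Group records by machine, then flag any drop beyond an assumed 10-point
# threshold between consecutive software versions on the same machine.
by_machine = defaultdict(list)
for machine, version, rate in registry:
    by_machine[machine].append((version, rate))

for machine, history in by_machine.items():
    for (v_old, r_old), (v_new, r_new) in zip(history, history[1:]):
        if r_old - r_new > 0.10:
            print(f"{machine}: agreement fell {r_old:.0%} -> {r_new:.0%} "
                  f"after {v_old} -> {v_new} upgrade")
```

Run on the sample data, this flags only CT-02 after its version change, mirroring the authors’ hypothetical of poor performance localized to a single machine following a software upgrade.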
Contributing Factors That Can Affect Acceptance of AI Models
8. Identifying “Wow” cases can help build stakeholder buy-in for AI models. The statement authors noted that cases demonstrating a significant impact on patient outcomes or operational efficiency can vividly convey the potential of AI to stakeholders such as referring physicians and facility administrators.
9. Automation bias, algorithmic bias, and user-interface (UI) design may also factor into the assessment and acceptance of AI models, according to the statement authors. They noted one study in which text-only UI output outperformed radiologist readers for pulmonary nodule detection, while AI image overlays, often preferred by radiologists in this use context, did not enhance the performance of reviewing radiologists.