Incorrect AI Advice May Affect Accuracy of Mammogram Readings

If an AI support system offers incorrect advice, it may affect how accurately radiologists read mammograms, no matter how experienced they are.

Published on May 7, 2023

If an artificial intelligence (AI) decision support system offers incorrect advice to a radiologist reading mammograms, it could seriously affect the accuracy of the readings, according to a small study. The experience level of the radiologist didn’t matter.

The research was published on May 2, 2023, in the journal Radiology. Read the abstract of “Automation Bias in Mammography: The Impact of Artificial Intelligence BI-RADS Suggestions on Reader Performance.”

AI and mammograms

You’ve probably heard of AI being used for everything from creating social media posts to unlocking your phone with face recognition technology.

To teach an AI system to read mammograms, technicians input information from millions of cases. The technology builds a mathematical representation (an algorithm) of what a normal mammogram looks like and what a mammogram with cancer looks like. The system can then examine each new mammogram image in more detail than the human eye can and check it against those learned patterns to flag any abnormalities.
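
To make that idea concrete, here’s a minimal, purely illustrative Python sketch: a toy classifier learns from labeled examples and then produces a suspicion score for a new case. Every name and number in it is invented for illustration; real mammography AI relies on deep neural networks trained on actual images, not a simple model over random feature vectors.

```python
# Illustrative sketch only: a toy classifier trained on synthetic data,
# standing in for the far more complex models used on real mammograms.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend each "mammogram" is a vector of pixel-derived features.
n_cases, n_features = 1000, 64
X = rng.normal(size=(n_cases, n_features))
y = rng.integers(0, 2, size=n_cases)  # 0 = normal, 1 = cancer (made-up labels)
X[y == 1] += 0.5                      # give "cancer" cases a faint signature

# Learn from labeled examples, then score cases the model hasn't seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# For a new image, the model returns a suspicion score a radiologist can review.
print("suspicion score for one new case:", model.predict_proba(X_test[:1])[0, 1])
```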

AI-based mammogram support systems are often described as a second set of eyes for radiologists and are considered one of the most promising applications of AI in radiology.

Still, there are concerns that radiologists may be susceptible to automation bias — the tendency of people to favor suggestions from automated decision-making systems. Several studies have shown that introducing computer-aided detection into the mammography workflow could affect how well radiologists perform. But no studies — until this one — have looked at how AI-based systems influence the accuracy of mammogram readings by radiologists.

About the study

The researchers wanted to see how automation bias can affect radiologists with different levels of expertise who use an AI system to help them read mammograms.

The researchers had 27 radiologists read 50 mammograms. Each radiologist assigned a BI-RADS category to each mammogram with the help of an AI system.

BI-RADS stands for Breast Imaging Reporting and Data System. Radiologists in the United States and some other countries use BI-RADS to report mammogram findings in a standardized way.

There are seven BI-RADS categories (also shown as a simple lookup table after this list):

  • Category 0: more imaging needed before a category can be assigned

  • Category 1: no noticeable abnormality

  • Category 2: benign (non-cancerous) finding, such as benign calcifications

  • Category 3: probably benign, but should be rechecked before the next annual mammogram

  • Category 4: suspicious abnormality that could be cancer

  • Category 5: findings look like and probably are cancer

  • Category 6: findings on the mammogram have been proven to be cancer by a biopsy
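
For reference, the categories above can be collected into a small lookup table. This Python sketch simply mirrors the plain-language list; it is not the official BI-RADS wording.

```python
# Plain-language BI-RADS lookup table, mirroring the list above.
BI_RADS = {
    0: "more imaging needed before a category can be assigned",
    1: "no noticeable abnormality",
    2: "benign (non-cancerous) finding, such as benign calcifications",
    3: "probably benign; recheck before the next annual mammogram",
    4: "suspicious abnormality that could be cancer",
    5: "findings look like and probably are cancer",
    6: "cancer proven by a biopsy",
}

print(f"BI-RADS 4: {BI_RADS[4]}")
```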

The researchers presented the mammograms in two sets:

  • The first set of 10 mammograms was a training set, and the AI system suggested the correct BI-RADS category for all of them.

  • In the second set of 40 mammograms, the suggestions (still presented as coming from the AI system) were deliberately incorrect for 12 of the mammograms.

The researchers also grouped the radiologists by experience level:

  • 11 were inexperienced radiologists with zero to two months of experience reading mammograms

  • 11 were moderately experienced radiologists with about 13 months of experience reading mammograms

  • five were very experienced radiologists with about 11 years of experience reading mammograms

The results showed that, no matter their level of experience, the radiologists were more likely to assign an incorrect BI-RADS category to a mammogram when the purported AI suggestion was incorrect:

  • Inexperienced radiologists assigned the correct BI-RADS category to 79.7% of the mammograms when the AI suggested the correct category, but only to 19.8% of the mammograms when the AI suggested the incorrect category.

  • Moderately experienced radiologists assigned the correct BI-RADS category to 81.3% of the mammograms when the AI suggested the correct category, but only to 24.8% of the mammograms when the AI suggested the incorrect category.

  • Very experienced radiologists assigned the correct BI-RADS category to 82.3% of the mammograms when the AI suggested the correct category, but only to 45.5% of the mammograms when the AI suggested the incorrect category.

“We anticipated that inaccurate AI predictions would influence the decisions made by radiologists in our study, particularly those with less experience,” study lead author Thomas Dratsch, MD, PhD, of the Institute of Diagnostic and Interventional Radiology at University Hospital Cologne in Cologne, Germany, said in a statement. “Nonetheless, it was surprising to find that even highly experienced radiologists were adversely impacted by the AI system’s judgments, albeit to a lesser extent than their less seasoned counterparts.

“Given the repetitive and highly standardized nature of mammography screening, automation bias may become a concern when an AI system is integrated into the workflow,” Dr. Dratsch continued. “Our findings emphasize the need for implementing appropriate safeguards when incorporating AI into the radiological process to mitigate the negative consequences of automation bias.”

What this means for you

The results of this study are concerning. Still, in both the research paper and an accompanying editorial in Radiology, the researchers and another radiology scientist noted there are strategies that can reduce the effect of automation bias in AI-assisted mammogram reading.

Strategy One: Radiologists should receive ongoing training in the AI tools they are using. This helps them understand the strengths and limitations of the technology and make more informed decisions when using AI to help read mammograms.

Strategy Two: Radiologists should be accountable for the decisions they make when reading mammograms. This could include benchmarking their overall performance and providing continuous feedback.

Strategy Three: The algorithm the AI system uses should be transparent and continuously tested and refined in real-world settings.

Strategy Four: The AI system should be used as a triage system that runs in the background, offering results only in specific situations where they would be expected to be helpful (see the sketch below).
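
As a thought experiment, here’s a minimal Python sketch of what a background triage step could look like. The threshold, score values, and message are all invented for illustration; the editorial doesn’t prescribe any particular implementation.

```python
# Hypothetical sketch of AI as a background triage step: a suspicion score is
# computed for every mammogram, but a suggestion is shown to the radiologist
# only when it crosses a (made-up) high-confidence threshold. Staying silent
# on ambiguous cases is meant to avoid anchoring the reader's judgment.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TriageResult:
    score: float               # AI suspicion score: 0.0 (normal) to 1.0 (cancer)
    suggestion: Optional[str]  # message shown to the radiologist, or None

def triage(score: float, high: float = 0.90) -> TriageResult:
    """Surface a suggestion only for clearly high-suspicion cases."""
    suggestion = "flag for priority review" if score >= high else None
    return TriageResult(score, suggestion)

print(triage(0.95))  # TriageResult(score=0.95, suggestion='flag for priority review')
print(triage(0.40))  # TriageResult(score=0.4, suggestion=None)
```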

In future studies, the researchers plan to use tools such as eye-tracking technology to better understand the decision-making process of radiologists using AI. They also want to investigate the best ways of presenting AI information to radiologists so they can use the information effectively and without automation bias.
