Quantitative evaluation of Saliency-Based Explainable artificial intelligence (XAI) methods in Deep Learning-Based mammogram analysis

Cerekci, Esma; Alis, Deniz; Denizoglu, Nurper; Camurdan, Ozden; Seker, Mustafa Ege; Ozer, Caner; Hansu, Muhammed Yusuf; Tanyel, Toygar; Oksuz, Ilkay; Karaarslan, Ercan

doi:10.1016/j.ejrad.2024.111356

Published January 1, 2024 | Version v1

Journal article | Open access

1. Sisli Hamidiye Etfal Training & Res Hosp, Dept Radiol, Istanbul, Turkiye
2. Acibadem Mehmet Ali Aydinlar Univ, Sch Med, Dept Radiol, Istanbul, Turkiye
3. Acibadem Healthcare Grp, Dept Radiol, Istanbul, Turkiye
4. Acibadem Mehmet Ali Aydinlar Univ, Sch Med, Istanbul, Turkiye
5. Istanbul Tech Univ, Dept Comp Engn, Istanbul, Turkiye
6. Istanbul Tech Univ, Dept Elect & Commun Engn, Istanbul, Turkiye
7. Istanbul Tech Univ, Dept Biomed Engn, Istanbul, Turkiye

Background: Explainable Artificial Intelligence (XAI) is prominent in the diagnostics of opaque deep learning (DL) models, especially in medical imaging. Saliency methods are commonly used, yet there is a lack of quantitative evidence regarding their performance.

Objectives: To quantitatively evaluate the performance of widely used saliency XAI methods in the task of breast cancer detection on mammograms.

Methods: Three radiologists drew ground-truth boxes on a balanced mammogram dataset of women (n = 1,496 cancer-positive and negative scans) from three centers. A modified, pre-trained DL model was employed for breast cancer detection, using MLO and CC images. Saliency XAI methods, including Gradient-weighted Class Activation Mapping (Grad-CAM), Grad-CAM++, and Eigen-CAM, were evaluated. We used the Pointing Game to assess these methods: for each image, it checks whether the maximum value of the saliency map falls within a ground-truth bounding box. The resulting score is the ratio of correctly identified lesions among all cancer patients and ranges from 0 to 1.

Results: The development sample included 2,244 women (75%), with the remaining 748 women (25%) in the testing set for unbiased XAI evaluation. The model's recall, precision, accuracy, and F1-Score in identifying cancer in the testing set were 69%, 88%, 80%, and 0.77, respectively. The Pointing Game Scores for Grad-CAM, Grad-CAM++, and Eigen-CAM were 0.41, 0.30, and 0.35 in women with cancer and marginally increased to 0.41, 0.31, and 0.36 when considering only true-positive samples.

Conclusions: While saliency-based methods provide some degree of explainability, in a considerable number of instances they fall short of delineating how DL models arrive at their decisions.
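The Pointing Game described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(x_min, y_min, x_max, y_max)` box format with inclusive pixel coordinates, and the toy arrays are all assumptions made for illustration. It also assumes each saliency map has already been resized to the image grid and applies no tolerance margin around the maximum, since the abstract does not say whether one was used.

```python
import numpy as np


def pointing_game_hit(saliency, boxes):
    """Return True if the saliency maximum lies inside any ground-truth box.

    saliency: 2D array of shape (H, W), assumed to match the image grid.
    boxes: iterable of (x_min, y_min, x_max, y_max), inclusive pixel coordinates
           (a hypothetical format chosen for this sketch).
    """
    # Location of the single highest saliency value (first one if tied).
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    return any(x0 <= col <= x1 and y0 <= row <= y1 for x0, y0, x1, y1 in boxes)


def pointing_game_score(saliency_maps, boxes_per_image):
    """Fraction of images whose saliency maximum hits a box (0 to 1)."""
    hits = [pointing_game_hit(s, b) for s, b in zip(saliency_maps, boxes_per_image)]
    return sum(hits) / len(hits) if hits else float("nan")


# Toy data: two 4x4 saliency maps, each with one annotated box.
hit_map = np.zeros((4, 4))
hit_map[1, 2] = 1.0   # maximum at row 1, column 2: inside the box
miss_map = np.zeros((4, 4))
miss_map[3, 0] = 1.0  # maximum at row 3, column 0: outside the box
box = [(2, 1, 3, 2)]

score = pointing_game_score([hit_map, miss_map], [box, box])
print(score)
```

Restricting `saliency_maps` to cancer-positive images, or to true-positive predictions only, would correspond to the two variants of the score reported in the Results.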

Files (316 Bytes)

Name	Size
bib-43b3b494-f2cc-4692-a97b-a36c0a3ec5eb.txt md5:3e4e5ad13a8e7b6eb0e1507d2fd2d85e	316 Bytes

