Affiliations 

  • 1 Department of Mechatronics and Biomedical Engineering, Faculty of Engineering and Science, Lee Kong Chian, Universiti Tunku Abdul Rahman, Kampar, Malaysia
  • 2 Department of Mechatronics and Biomedical Engineering, Faculty of Engineering and Science, Lee Kong Chian, Universiti Tunku Abdul Rahman, Kampar, Malaysia. humyc@utar.edu.my
  • 3 Department of Electrical and Electronic Engineering, Faculty of Engineering and Science, Lee Kong Chian, Universiti Tunku Abdul Rahman, Kampar, Malaysia
  • 4 Department of Electronic Engineering, Faculty of Engineering and Green Technology, Universiti Tunku Abdul Rahman, 31900, Kampar, Malaysia
  • 5 Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology, Lulea, Sweden
  • 6 School of Electronics Engineering, Vellore Institute of Technology, Amaravati, AP, India
  • 7 Department of Biomedical Engineering, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
Sci Rep, 2023 Nov 22;13(1):20518.
PMID: 37993544 DOI: 10.1038/s41598-023-46619-6

Abstract

Debate persists over the impact of Stain Normalization (SN) in recent breast cancer histopathological studies. Some studies report no influence on classification outcomes, while others report improvement. This study assesses the efficacy of SN in breast cancer histopathological classification, focusing on Invasive Ductal Carcinoma (IDC) grading using Convolutional Neural Networks (CNNs). The null hypothesis states that SN has no effect on the accuracy of CNN-based IDC grading; the alternative hypothesis states that it does. We evaluated six SN techniques, with five templates selected as target images for the conventional SN techniques, and used seven ImageNet pre-trained CNNs for IDC grading. The performance of models trained with and without SN was compared to determine the influence of SN on classification outcomes. Models trained with StainGAN-normalized images (the best-performing SN technique) achieved a Balanced Accuracy Score of 0.9196, against 0.9308 for models trained with non-normalized images; the difference was not statistically significant (p = 0.11). We therefore did not reject the null hypothesis: we found no evidence of a significant difference in effectiveness between stain-normalized and non-normalized datasets for IDC grading. These results indicate that SN has a limited impact on IDC grading, challenging the assumption that SN enhances performance.
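
For readers unfamiliar with the metric, the Balanced Accuracy Score reported above is conventionally defined as the mean of per-class recall, which keeps a majority grade from dominating the score. The following is a minimal sketch of that standard definition, not the authors' implementation; the function name `balanced_accuracy` and the toy grade labels are hypothetical and are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall for each class, averaged with equal weight per class.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy labels for three IDC grades (illustration only,
# not results from the study).
y_true = [1, 1, 1, 1, 2, 2, 3, 3, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2, 3, 3, 1, 3]

score = balanced_accuracy(y_true, y_pred)
print(round(score, 4))
```

In this toy case the per-class recalls are 0.75, 1.0, and 0.75, so the balanced accuracy is about 0.8333, while plain accuracy would be 0.8; the two diverge further as class sizes become more unequal.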
