Comparing Number and relevance Of false activations between two Artificial Intelligence CADe SystEms: the NOISE study

Spadaccini M; Hassan C; Alfarone L; Da Rio L; Maselli R; Carrara S; Galtieri PA; Pellegatta G; Fugazza A; Koleth G; Emmanuel J; Anderloni A; Mori Y; Wallace MB; Sharma P; Repici A

doi:10.1016/j.gie.2021.12.031

Spadaccini M ¹, Hassan C ², Alfarone L ², Da Rio L ², Maselli R ², Carrara S ³, Galtieri PA ³, Pellegatta G ³, Fugazza A ³, Koleth G ⁴, Emmanuel J ⁵, Anderloni A ³, Mori Y ⁶, Wallace MB ⁷, Sharma P ⁸, Repici A ²

Affiliations

¹ Humanitas University, Department of Biomedical Sciences, Pieve Emanuele, Italy; Humanitas Clinical and Research Center -IRCCS-, Endoscopy Unit, Rozzano, Italy. Electronic address: marco.spadaccini@humanitas.it
² Humanitas University, Department of Biomedical Sciences, Pieve Emanuele, Italy; Humanitas Clinical and Research Center -IRCCS-, Endoscopy Unit, Rozzano, Italy
³ Humanitas University, Department of Biomedical Sciences, Pieve Emanuele, Italy
⁴ Hospital Selayang, Department of Gastroenterology and Hepatology, Selangor, Malaysia
⁵ Queen Elizabeth Hospital, Department of Gastroenterology and Hepatology, Sabah, Malaysia
⁶ Clinical Effectiveness Research Group, Institute of Health and Society, Faculty of Medicine, University of Oslo, Oslo, Norway; Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
⁷ Sheikh Shakhbout Medical City, Endoscopy Unit, Abu Dhabi, UAE
⁸ Kansas City VA Medical Center, Gastroenterology and Hepatology, Kansas City, United States

Gastrointest Endosc, 2022 Jan 04.

PMID: 34995639 DOI: 10.1016/j.gie.2021.12.031

Abstract

BACKGROUND AND AIMS: Artificial intelligence (AI) has been shown to be effective in polyp detection, and multiple computer-aided detection (CADe) systems have been developed. False-positive (FP) activations have emerged as a possible way to benchmark CADe performance in clinical practice. The aim of this study was to validate a previously developed classification of FPs by comparing the performance of different brands of approved CADe systems.

METHODS: We compared 2 consecutive video libraries (40 videos per arm) collected at Humanitas Research Hospital with 2 different CADe system brands (CADe A and CADe B). For each video, we recorded the number of CADe false activations, their cause, and the time spent by the endoscopist examining the erroneously highlighted area. The FP activations were classified by cause and relevance according to the previously developed classification of false positives (the NOISE classification).

RESULTS: A total of 1021 FP activations were registered across the 40 videos of Group A (25.5 ± 12.2 FPs per colonoscopy). A comparable number of FPs was identified in Group B (n = 1028; mean 25.7 ± 13.2 FPs per colonoscopy; p = 0.53). Of these, 22.9 ± 9.9 (89.8%, Group A) and 22.1 ± 10.0 (86.0%, Group B) per colonoscopy were due to artifacts from the bowel wall. Conversely, 2.6 ± 1.9 (10.2%, Group A) and 3.5 ± 2.1 (14%, Group B) were caused by bowel content (p = 0.45). In Group A, each false activation required 0.2 ± 0.9 seconds, with 1.6 ± 1.0 FPs per colonoscopy (6.3%) requiring additional time for endoscopic assessment. Comparable results were reported in Group B, with 0.2 ± 0.8 seconds spent per false activation and 1.8 ± 1.2 FPs per colonoscopy requiring additional inspection.
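The per-colonoscopy means and percentages in the RESULTS paragraph follow from simple arithmetic on the reported figures. The short sketch below is illustrative only and is not part of the study's analysis: it divides each group's FP total by the 40 videos per arm and expresses the bowel-wall and bowel-content means as a share of the group mean. The names `groups`, `mean_per_video`, `share`, and `summary` are assumptions introduced for this example. Under this calculation the bowel-content share in Group B comes to 13.6%, which the abstract reports as 14%.

```python
# Illustrative arithmetic check on figures quoted in the abstract.
# All names here are hypothetical; only the numbers come from the text.
VIDEOS_PER_ARM = 40

groups = {
    # total_fp: total false activations; wall/content: mean FPs per colonoscopy
    "A": {"total_fp": 1021, "wall": 22.9, "content": 2.6},
    "B": {"total_fp": 1028, "wall": 22.1, "content": 3.5},
}


def mean_per_video(total, n_videos=VIDEOS_PER_ARM):
    """Mean number of false activations per colonoscopy video."""
    return total / n_videos


def share(part_mean, whole_mean):
    """Percentage of the per-video mean attributable to one cause."""
    return 100.0 * part_mean / whole_mean


summary = {}
for name, g in groups.items():
    mean_fp = round(mean_per_video(g["total_fp"]), 1)
    summary[name] = {
        "mean_fp": mean_fp,
        "wall_pct": round(share(g["wall"], mean_fp), 1),
        "content_pct": round(share(g["content"], mean_fp), 1),
    }

for name, row in summary.items():
    print(name, row)
```

The standard deviations and p-values cannot be reproduced this way, since they depend on per-video counts that the abstract does not report.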

CONCLUSION: The use of a standardized nomenclature made it possible to obtain comparable results with both of the 2 recently approved CADe systems.
