PMID: 19163841 DOI: 10.1109/IEMBS.2008.4650338

Abstract

Software technology enables computerized analysis to offer a second opinion that assists clinicians in various screening and diagnostic tasks. Yet experts in CAD research question the reported performance of these computerized methods for medical images, because different studies use different databases and different criteria to evaluate the computer results, which hinders direct comparison. This paper substantiates that concern by illustrating the effects of these issues on 1D physiologic data drawn from multiple databases. Two detection tasks are used for this purpose: desaturation events in SpO2 and spike events in EEG. This is the first time that an individual effort has compared different algorithms on a common basis: all the algorithms are appraised on the same databases and with the same criteria. Surprisingly, the issues reported for 2D/3D images concur with those found here in 1D data. Evaluating the accuracy of a new algorithm on a single independent database gives results quickly, but this paper reveals the weaknesses of such an approach. It is hoped that the evidence presented here is sufficient to prompt researchers to devise a better platform for credible reporting of performance comparisons among computerized analysis algorithms.
