METHODS: The study included 382 participants (252 normal voices and 130 dysphonic voices) in the proposed database, MVPD. Complete data were obtained for both groups, including voice samples, laryngostroboscopy videos, and acoustic analysis, and clinical diagnoses were recorded for the patients with dysphonia. Each voice sample was anonymized with a code unique to each individual and stored in the MVPD. These voice samples were used to train and test the proposed OSELM algorithm. The performance of OSELM was evaluated and compared with that of other classifiers in terms of the accuracy, sensitivity, and specificity of detecting and differentiating dysphonic voices.
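OSELM (Online Sequential Extreme Learning Machine) trains a single-hidden-layer network whose input weights are fixed at random, fitting only the output weights: first by regularized least squares on an initial batch, then by recursive least-squares updates as new samples arrive. The abstract does not give the authors' implementation details, so the following is a minimal generic sketch of the OS-ELM update rules; the class name, hidden-layer size, sigmoid activation, and decision threshold are illustrative assumptions, not taken from the study.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class OSELM:
    """Minimal Online Sequential Extreme Learning Machine sketch (binary classifier)."""

    def __init__(self, n_hidden, n_features, rng=None):
        rng = np.random.default_rng(rng)
        # Input weights and biases are random and never updated (ELM principle).
        self.W = rng.standard_normal((n_features, n_hidden))
        self.b = rng.standard_normal(n_hidden)
        self.beta = None  # output weights, the only trained parameters
        self.P = None     # inverse covariance for recursive least squares

    def _h(self, X):
        # Hidden-layer activations for a batch of samples.
        return sigmoid(X @ self.W + self.b)

    def fit_initial(self, X, y):
        # Initial batch: regularized least-squares solution for the output weights.
        H = self._h(X)
        self.P = np.linalg.inv(H.T @ H + 1e-6 * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ y

    def partial_fit(self, X, y):
        # Sequential chunk: recursive least-squares update of P and beta.
        H = self._h(X)
        K = self.P @ H.T @ np.linalg.inv(np.eye(X.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (y - H @ self.beta)

    def predict(self, X):
        # Threshold the regression output at 0.5 for binary labels (0/1).
        return (self._h(X) @ self.beta > 0.5).astype(int)
```

In practice the model would be initialized on a first batch of labelled voice samples and then refined chunk by chunk, which is what makes OS-ELM suitable for data that arrives incrementally.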
RESULTS: The accuracy, sensitivity, and specificity of OSELM in detecting normal and dysphonic voices were 90%, 98%, and 73%, respectively. The classifier differentiated between structural and non-structural vocal fold pathology with an accuracy, sensitivity, and specificity of 84%, 89%, and 88%, respectively, while it differentiated between malignant and benign lesions with an accuracy, sensitivity, and specificity of 92%, 100%, and 58%, respectively. Compared with other classifiers, OSELM showed superior accuracy and sensitivity in detecting dysphonic voices, in differentiating structural from non-structural vocal fold pathology, and in differentiating malignant from benign voice pathology.
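The three metrics reported above all derive from the binary confusion matrix: sensitivity is the fraction of true pathologies flagged, and specificity is the fraction of normal cases correctly passed. A minimal sketch of how such metrics are computed (the function name and label convention, with 1 = pathological, are illustrative assumptions):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity for binary labels (1 = positive/pathological)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # pathologies flagged
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # normals passed
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false alarms
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # missed pathologies
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity
```

Note how the reported pattern of high sensitivity (98%, 100%) with lower specificity (73%, 58%) corresponds to many true positives at the cost of more false alarms, a trade-off often preferred in screening settings.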
CONCLUSION: The OSELM algorithm exhibited the highest accuracy and sensitivity among the compared classifiers in detecting voice pathology, differentiating malignant from benign lesions, and distinguishing structural from non-structural vocal pathology. Hence, it is a promising artificial intelligence approach for an online screening application that could encourage people to seek early medical consultation for a definitive diagnosis of voice pathology.
MATERIALS AND METHODS: We developed a web interface, hosted on a web server, to collect images of oral lesions from international partners. We also developed a customised annotation tool, likewise a web interface, for the systematic annotation of images to build a rich, clinically labelled dataset. We evaluated sensitivity by comparing the referral decisions made through the annotation process with the clinical diagnoses of the lesions.
RESULTS: The image repository hosts 2474 images of oral lesions, consisting of oral cancer, oral potentially malignant disorders, and other oral lesions, collected through MeMoSA® UPLOAD. Eight hundred images were annotated by seven oral medicine specialists on MeMoSA® ANNOTATE to mark the lesions and collect clinical labels. The sensitivity of referral decisions for lesions requiring referral for cancer management/surveillance was moderate to high, depending on the type of lesion (64.3%-100%).
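The per-lesion-type sensitivity range reported above (64.3%-100%) amounts to computing, within each lesion category, the proportion of lesions that clinically required referral and were in fact referred by the annotators. A minimal sketch of that stratified computation (the record layout, function name, and category labels are illustrative assumptions, not from the study):

```python
from collections import defaultdict

def referral_sensitivity_by_type(records):
    """records: iterable of (lesion_type, annotator_referred, clinically_needs_referral).

    Returns {lesion_type: sensitivity}, restricted to lesions that
    clinically required referral (the positives for this analysis).
    """
    counts = defaultdict(lambda: [0, 0])  # lesion_type -> [correctly referred, total needing referral]
    for lesion_type, referred, needs_referral in records:
        if needs_referral:
            counts[lesion_type][1] += 1
            if referred:
                counts[lesion_type][0] += 1
    return {t: tp / pos for t, (tp, pos) in counts.items() if pos}
```

Stratifying by lesion type, rather than pooling all lesions, is what reveals that some categories are referred reliably while others are missed more often.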
CONCLUSION: This is the first description of a database of clinically labelled oral lesion images. This database could accelerate the development of AI algorithms for the early detection of high-risk oral lesions.