DESIGN: Cross-sectional reliability study.
SETTING: University teaching hospital.
METHODS: Fifty healthy volunteers and 50 patients with voice disorders underwent supervised recordings in a quiet room using OperaVOX with the iPod's internal microphone at a sampling rate of 45 kHz. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer and noise-to-harmonic ratio (NHR). All healthy volunteers and 21 patients had a second recording. The recorded voices were also analysed using the MDVP. Inter- and intrasoftware reliability was analysed using the intraclass correlation (ICC) test and the Bland-Altman (BA) method. The Mann-Whitney test was used to compare the acoustic parameters between healthy volunteers and patients.
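The Bland-Altman step can be illustrated with a minimal sketch. It assumes the conventional definition (bias as the mean of the paired differences, 95% limits of agreement as bias ± 1.96 sample standard deviations), which the abstract does not spell out. The function name `bland_altman` and the sample values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, upper LOA, lower LOA) for paired measurements a and b.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD
    of the paired differences (sample SD, n - 1 denominator).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias + 1.96 * sd, bias - 1.96 * sd


# Hypothetical F0 readings (Hz) from two programs on the same recordings.
f0_app = [120.0, 185.0, 210.0, 98.0]
f0_ref = [119.0, 186.5, 208.0, 99.0]
bias, upper, lower = bland_altman(f0_app, f0_ref)
print(f"bias={bias:.2f} Hz, 95% LOA=({upper:.2f}, {lower:.2f})")
```

The result is ordered as (bias, upper limit, lower limit), matching the order in which the limits are reported in the results.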
RESULTS: Nine of the 50 patients had severely aperiodic voices. The ICC was high, with a confidence interval of >0.75, for both inter- and intrasoftware reliability, except for the NHR. For the intersoftware BA analysis, excluding the severely aperiodic voice data sets, the bias (95% LOA) of F0, jitter, shimmer and NHR was 0.81 (11.32, -9.71); -0.13 (1.26, -1.52); -0.52 (1.68, -2.72); and 0.08 (0.27, -0.10), respectively. For the intrasoftware reliability, the corresponding values were -1.48 (18.43, -21.39); 0.05 (1.31, -1.21); -0.01 (2.87, -2.89); and 0.005 (0.20, -0.18). Normative data from the healthy volunteers were obtained. There was a significant difference in all acoustic parameters between volunteers and patients measured by OperaVOX (P
METHOD: Three psychophysical properties of hearing (critical-band spectral estimation, the equal-loudness hearing curve, and the intensity-loudness power law of hearing) are used to estimate the auditory spectrum. The auditory spectra and all-pole models of the auditory spectra are computed, analyzed, and used in a Gaussian mixture model for automatic decision-making.
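The three psychophysical steps can be sketched as follows. This is an illustration only, assuming the formulas commonly used in perceptual linear prediction (Bark-scale warping, an analytic equal-loudness approximation, and cube-root amplitude compression). The abstract does not state which formulations the authors used, and all function names here are hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Warp frequency in Hz onto the Bark (critical-band) scale.

    Assumed form: 6 * asinh(f / 600), as used in PLP analysis.
    """
    return 6.0 * math.asinh(freq_hz / 600.0)


def equal_loudness_weight(freq_hz):
    """Approximate equal-loudness weighting at a given frequency.

    Assumed PLP-style analytic approximation (unnormalized).
    """
    w2 = (2.0 * math.pi * freq_hz) ** 2  # squared angular frequency
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))


def intensity_to_loudness(power):
    """Intensity-loudness power law: cube-root compression of power."""
    return power ** (1.0 / 3.0)


def auditory_sample(freq_hz, power):
    """Return (Bark position, compressed loudness) for one spectral sample.

    Critical-band integration (summing power within each Bark band)
    is omitted to keep the sketch short.
    """
    weighted = equal_loudness_weight(freq_hz) * power
    return hz_to_bark(freq_hz), intensity_to_loudness(weighted)


print(auditory_sample(1000.0, 1.0))
```

A full implementation would also integrate power over overlapping critical-band filters before the loudness steps, and would then fit the all-pole model to the resulting auditory spectrum.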
RESULTS: In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech-based systems.
DISCUSSION: The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.