
  1. Ahmad K, Yan Y, Bless D
    J Voice, 2012 Nov;26(6):751-9.
    PMID: 22633334 DOI: 10.1016/j.jvoice.2011.12.002
    A high proportion of the geriatric population suffers from presbylaryngis and presbyphonia; however, our knowledge of vibratory patterns in this population is almost nonexistent. In this study, we investigate the vocal fold vibratory patterns of healthy elderly females to determine which features or combination of them could best describe the geriatric voices.
  2. Ahmad K, Yan Y, Bless DM
    J Voice, 2012 Mar;26(2):239-53.
    PMID: 21621975 DOI: 10.1016/j.jvoice.2011.02.001
    The purpose of the study was to investigate relationships between vocal fold vibrations and voice quality. Laryngeal images obtained from high-speed digital imaging (HSDI) were examined for their open-closed timing characteristics and perturbation values. A customized software delineated the glottal edges and used the Hilbert transform-based method of analysis to provide objective quantification of glottal perturbation. Overlay tracings of the transformed glottal cycles provided visual patterns on the overall vibratory dynamics. In this paper, we described the use of this method in looking at vibratory characteristics of a group of young female speakers (N=23). We found that, females with no voice complaints and who had been perceived to have normal voices were not a homogeneous group in terms of their glottal vibratory patterns during phonation. Their vibratory patterns showed characteristics similar to exemplar voices targeted to be clear (50%), pressed (27%), breathy (15%), or a mixed quality (8%). Perturbation range in terms of cycle-to-cycle frequency and amplitude was small and did not discriminate patterns. All these patterns yielded perceptually normal voices suggesting that in normal young speakers, the level of perturbation may be more important to the judgment than the actual pattern of closure.
  3. Ting HN, Chia SY, Abdul Hamid B, Mukari SZ
    J Voice, 2011 Nov;25(6):e305-9.
    PMID: 21429707 DOI: 10.1016/j.jvoice.2010.05.007
    The acoustic characteristics of sustained vowel have been widely investigated across various languages and ethnic groups. These acoustic measures, including fundamental frequency (F(0)), jitter (Jitt), relative average perturbation (RAP), five-point period perturbation quotient (PPQ5), shimmer (Shim), and 11-point amplitude perturbation quotient (APQ11) are not well established for Malaysian Malay young adults. This article studies the acoustic measures of Malaysian Malay adults using acoustical analysis. The study analyzed six sustained Malay vowels of 60 normal native Malaysian Malay adults with a mean of 21.19 years. The F(0) values of Malaysian Malay males and females were reported as 134.85±18.54 and 238.27±24.06Hz, respectively. Malaysian Malay females had significantly higher F(0) than that of males for all the vowels. However, no significant differences were observed between the genders for the perturbation measures in all the vowels, except RAP in /e/. No significant F(0) differences between the vowels were observed. Significant differences between the vowels were reported for all perturbation measures in Malaysian Malay males. As for Malaysian Malay females, significant differences between the vowels were reported for Shim and APQ11. Multiethnic comparisons indicate that F(0) varies between Malaysian Malay and other ethnic groups. However, the perturbation measures cannot be directly compared, where the measures vary significantly across different speech analysis softwares.
  4. Ting HN, Zourmand A, Chia SY, Yong BF, Abdul Hamid B
    J Voice, 2012 Sep;26(5):664.e1-6.
    PMID: 22285457 DOI: 10.1016/j.jvoice.2011.08.008
    The formant frequencies of Malaysian Malay children have not been well studied. This article investigates the first four formant frequencies of sustained vowels in 360 Malay children aged between 7 and 12 years using acoustical analysis. Generally, Malay female children had higher formant frequencies than those of their male counterparts. However, no significant differences in all four formant frequencies were observed between the Malay male and female children in most of the vowels and age groups. Significant differences in all formant frequencies were found across the Malay vowels in both Malay male and female children for all age groups except for F4 in female children aged 12 years. Generally, the Malaysian Malay children showed a nonsystematic decrement in formant frequencies with age. Low levels of significant differences in formant frequencies were observed across the age groups in most of the vowels for F1, F3, and F4 in Malay male children and F1 and F4 in Malay female children.
  5. Ting HN, Chia SY, Manap HH, Ho AH, Tiu KY, Abdul Hamid B
    J Voice, 2012 Jul;26(4):425-30.
    PMID: 22243972 DOI: 10.1016/j.jvoice.2011.07.001
    This study investigates the fundamental frequency (F(0)) and perturbation measures of sustained vowels in 360 native Malaysian Malay children aged between 7 and 12 years using acoustical analysis.
  6. Ting HN, Chia SY, Kim KS, Sim SL, Abdul Hamid B
    J Voice, 2011 Nov;25(6):e311-7.
    PMID: 21376529 DOI: 10.1016/j.jvoice.2010.05.004
    The acoustic properties of vowel phonation vary across cultures. These specific characteristics, including vowel fundamental frequency (F(0)) and perturbation measures (Absolute Jitter [Jita], Jitter [Jitt], Relative Average Perturbation [RAP], five-point Period Perturbation Quotient [PPQ5], Absolute Shimmer [ShdB], Shimmer [Shim], and 11-point Amplitude Perturbation Quotient [APQ11]) are not well established for Malaysian Chinese adults. This article investigates the F(0) and perturbation measurements of sustained vowels in 60 normal Malaysian Chinese adults using acoustical analysis. Malaysian Chinese females had significantly higher F(0) than Malaysian males in all six vowels. However, there were no significant differences in F(0) across the vowels for each gender. Significant differences between vowels were observed for Jita, Jitt, PPQ5, ShdB, Shim, and APQ11 among Chinese males, whereas significant differences between vowels were observed for all the perturbation parameters among Chinese females. Chinese males had significantly higher Jita and APQ11 in the vowels than Chinese females, whereas no significant differences were observed between males and females for Jitt, RAP, PPQ5, and Shim. Cross-ethnic comparisons indicate that F(0) of vowel phonation varies within the Chinese ethnic group and across other ethnic groups. The perturbation measures cannot be simply compared, where the measures may vary significantly across different speech analysis softwares.
  7. Zourmand A, Ting HN, Mirhassani SM
    J Voice, 2013 Mar;27(2):201-9.
    PMID: 23473455 DOI: 10.1016/j.jvoice.2012.12.006
    Speech is one of the prevalent communication mediums for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained 84.17% based on the optimal classification accuracy for all age groups. The accuracy was further increased to 99.81% using multilayer perceptron based on mel-frequency cepstral coefficients.
  8. Ali Z, Elamvazuthi I, Alsulaiman M, Muhammad G
    J Voice, 2016 Nov;30(6):757.e7-757.e19.
    PMID: 26522263 DOI: 10.1016/j.jvoice.2015.08.010
    BACKGROUND AND OBJECTIVE: Automatic voice pathology detection using sustained vowels has been widely explored. Because of the stationary nature of the speech waveform, pathology detection with a sustained vowel is a comparatively easier task than that using a running speech. Some disorder detection systems with running speech have also been developed, although most of them are based on a voice activity detection (VAD), that is, itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module.

    METHOD: A set of three psychophysics conditions of hearing (critical band spectral estimation, equal loudness hearing curve, and the intensity loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrums are computed and analyzed and used in a Gaussian mixture model for an automatic decision.

    RESULTS: In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech-based systems.

    DISCUSSION: The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.

  9. Al-Nasheri A, Muhammad G, Alsulaiman M, Ali Z, Mesallam TA, Farahat M, et al.
    J Voice, 2017 Jan;31(1):113.e9-113.e18.
    PMID: 27105857 DOI: 10.1016/j.jvoice.2016.03.019
    BACKGROUND AND OBJECTIVE: Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes.

    MATERIALS AND METHODS: Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples.

    RESULTS: The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively.

  10. Al-Yahya SN, Muhammad R, Suhaimi SNA, Azman M, Mohamed AS, Baki MM
    J Voice, 2020 Sep;34(5):811.e13-811.e20.
    PMID: 30612893 DOI: 10.1016/j.jvoice.2018.12.003
    OBJECTIVES: Selective laryngeal examination for patients undergoing thyroidectomy is recommended for patients with voice alterations, history of prior cervical or chest surgery, and patients with proven or suspected thyroid malignancy. The study objective is to measure the sensitivity of surgeons in detecting voice abnormalities in patients undergoing thyroidectomy, parathyroidectomy complicated with laryngeal nerve paralysis, or patients with known vocal cords palsy (VCP) due to other neck surgeries.

    DESIGN AND SETTING: Descriptive cross-sectional study in a tertiary center.

    PARTICIPANTS AND METHODS: The subjects are 274 audio files of voices of patients undergoing thyroid, parathyroid surgeries, and known VCP due to other neck surgeries. Voice assessments were done by three endocrine surgeons (A, B, and C) with 20, 12, and 4 years of surgical experience.

    MAIN OUTCOME MEASURES: Sensitivity and specificity of surgeon documented voice assessment in patients with underlying VCP. Subjects' acoustic analysis and Voice Handicap Index (VHI-10) were analyzed.

    RESULTS: Raters A, B, and C have sensitivity of 63.6%, 78.8%, and 66.7%, respectively. Inter-rater reliability shows substantial agreement (κ = 0.67). VHI-10 has sensitivity of 75.8% and strong correlation of 0.707 (p value <0.001) to VCP. Subjects with VCP have notably higher jitter, shimmer, and noise-to-harmonic ratio compared to normal subjects with sensitivity of 74.2%, 71.2%, and 72.7%, respectively.

    CONCLUSIONS: The results for surgeons documented voice assessment did not reach the desired sensitivity for a screening tool for patients with underlying VCP. Other tools such as VHI-10 and acoustic analysis may not be used as standalone tools in screening patients with underlying VCP. Routine preoperative laryngeal examination may be recommended for all patients undergoing thyroid, parathyroid, or other surgeries that places the laryngeal nerves at risk.

  11. Ramli MI, Hamzaid NA, Engkasan JP
    J Voice, 2019 Jul 09.
    PMID: 31300185 DOI: 10.1016/j.jvoice.2019.06.006
    OBJECTIVES: The aim of this study was to investigate the performance of mechanomyography (MMG) and electromyography (EMG) in monitoring the sternocleidomastoid (SCM) as accessory respiratory muscles when breathing during singing.

    METHODS: MMG and EMG were used to record the activity of the SCM in 32 untrained singers reciting a monotonous text and a standard folk song. Their voices were recorded and their pitch, or fundamental frequency (FF), and intensity were derived using Praat software. Instants of inhale and exhales were identified during singing from their voice recordings and the corresponding SCM MMG and EMG activities were analysed.

    RESULTS: The SCM MMG and EMG signals during breathing while singing were significantly different from those during breathing at rest (p < 0.001). MMG was relatively better correlated with voice intensity than EMG in both reading and singing. EMG was better, but not significantly, correlated with FF than MMG in both reading and singing.

    CONCLUSIONS: This study established MMG and EMG as quantitative measurement tools to monitor breathing activities during singing. This is useful for applications related to measuring singing therapy performance, potentially including pathologically affected populations. While MMG and EMG could not distinguish FF and intensity significantly, they are useful as a proxy of inhalation and exhalation levels throughout a particular singing session. Further studies are required to determine their efficacy in a therapeutic setting.

  12. Ong FM, Husna Nik Hassan NF, Azman M, Sani A, Mat Baki M
    J Voice, 2019 Jul;33(4):581.e17-581.e23.
    PMID: 29793874 DOI: 10.1016/j.jvoice.2018.01.015
    OBJECTIVES: This study aimed to determine the validity and reliability of Bahasa Malaysia version of Voice Handicap Index-10 (mVHI-10).

    MATERIALS AND METHODS: This cross-sectional study was carried out in the Otorhinolaryngology, Head and Neck Surgery Department of Universiti Kebangsaan Malaysia Medical Centre (UKMMC) from June 2015 to May 2016. The mVHI-10 was produced following a rigorous forward and backward translation. One hundred participants, including 50 healthy volunteers (17 male, 33 female) and 50 patients with voice disorders (26 male, 24 female), were recruited to complete the mVHI-10 before flexible laryngoscopic examinations and acoustic analysis. The mVHI-10 was repeated in 2 weeks via telephone interview or clinic visit. Its reliability and validity were assessed using interclass correlation.

    RESULTS: The test-retest reliability for total mVHI-10 and each item score was high, with the Cronbach alpha of >0.90. The total mVHI-10 score and domain scores were significantly higher (P 

  13. Mohd Khairuddin KA, Ahmad K, Mohd Ibrahim H, Yan Y
    J Voice, 2020 Aug 26.
    PMID: 32861565 DOI: 10.1016/j.jvoice.2020.07.036
    Facilitative playback-based subjective measures offer a more reliable evaluation of the vocal fold vibration than those derived from direct inspection of video playback. One of the measures is a Nyquist plot, which presents the analyzed cycle-to-cycle vibratory information in a graphical form. While the potential is evident, the information of the features of the Nyquist plot, which the evaluation is based on, is still incomplete. The current identified features and their vibratory behaviors may be inadequate to guarantee accurate interpretation of the findings. The present study aims to address this issue by examining the features of the Nyquist plot and their vibratory behaviors. A total of 56 young normophonic speakers, that is, 20 males and 36 females were recruited as the participants. Each of them underwent laryngeal high-speed videoendoscopy to record the images of the vocal fold vibration, which were then analyzed to generate the Nyquist plots. The features were identified by inspecting the properties of the plot points forming the Nyquist plots. For each identified feature, its vibratory behaviors were examined. The results revealed four features: rim contour depicting the longitudinal phase difference; left edge shape signifying the glottal configuration, phase closure, and closed phase duration; rim width and rim pattern visualizing the regularity of glottal areas and the regularity of the intracycle variations, respectively. The findings present a more complete reference of the features and their vibratory behaviors that is pertinent for the Nyquist plot interpretation.
  14. Al-Yahya SN, Mohamed Akram MHH, Vijaya Kumar K, Mat Amin SNA, Abdul Malik NA, Mohd Zawawi NA, et al.
    J Voice, 2020 Aug 27.
    PMID: 32861567 DOI: 10.1016/j.jvoice.2020.07.015
    OBJECTIVE: Maximum phonation time (MPT) is a test to measure glottic efficiency for laryngeal pathology screening and treatment monitoring. The normative value of MPT for South East Asia population has yet to be reported. It is postulated that MPT may be affected by body mass index (BMI) despite the paucity of evidence. Therefore, this study was designed to establish the normative value of MPT for a South East Asia population and investigate its relation to BMI.

    DESIGN & SETTING: This cross-sectional study was conducted in Universiti Kebangsaan Malaysia Medical Center between May and September 2017.

    PARTICIPANTS AND METHODS: Three hundred males and females with mean age of 30.23 (±11.04) years were recruited in equal number for each gender (n = 150) and divided into 3 groups of 50 according to their BMI (n = 50). The three groups are non-obese (BMI ≤ 22.9 kg/m²); obese (BMI between 23 and 34.9 kg/m²); and morbidly obese (BMI > 35 kg/m²). BMI and Voice Handicap Index-10 (VHI-10) were obtained. The average of three readings of MPT was measured using a stopwatch while the participants phonate /a/, /i/ and /u/. Unpaired t-test and ANOVA were used to compare means between and across groups. Spearman correlation assessed the correlation between MPT and BMI.

    MAIN OUTCOME MEASURES: The normative values of MPT of both genders and correlation with BMI were analyzed.

    RESULTS: The MPT normative values for males and females in the non-obese group were 21.41 (±6.85) seconds and 18.05 (±5.06) seconds, respectively, for /a/. The MPT for all vowels was significantly higher in males across the BMI groups (P ≤ 0.05). There was a low negative correlation between MPT and BMI in both genders.

    CONCLUSIONS: This pioneering study documented the normative values of MPT among Malaysians and showed that males had longer MPT than females across the BMI groups. Obesity affects the MPT in that as BMI increases, the MPT decreases.

  15. Idrose AM, Juliana N, Azmani S, Yazit NAA, Muslim MSA, Ismail M, et al.
    J Voice, 2020 Jul 29.
    PMID: 32736909 DOI: 10.1016/j.jvoice.2020.06.031
    At high altitude, low oxygen partial pressure predisposes the human body to hypobaric hypoxia that may lead to high-altitude illness. Singing has been used for rehabilitation of patients with lung diseases, but evidence on its role in high-altitude, low-oxygen environments is still scarce. This study aims to examine the effect of singing in improving oxygen saturation at different levels of high altitudes in a hypobaric chamber. Eight healthy volunteers were assigned to three interventions at three simulated altitudes (sea level, 3000 m and 5000 m). The oxygen saturation (SpO2) was measured via pulse oximetry under three conditions: no singing (NS), singing aloud (SA), and singing silently (SS). The "birthday song" was used as the standard song for 4 minutes. At sea level, mean NS SpO2 was 97.75% ± 1.04%. With SS, the level increased to 98.25% ± 1.04%. Mean SA SpO2 increased to 98.38% ± 0.92% (P < 0.05). At 3000 m, mean NS SpO2 was 92.75% ± 3.73% and rose to 94.50% ± 2.51% and 94.63% ± 2.00% respectively with SA and SS (P < 0.05). At 5000 m, NS level of 79.88% ± 3.60% increased to 82.13% ± 5.87% and 82.88% ± 7.12% with SA and SS respectively (P < 0.05). The repeated-measures ANOVA showed significant difference for altitude (P < 0.001) and intervention (P = 0.05). In conclusion, singing either "aloud" or "silently" significantly increased the level of SpO2 in simulated high altitude at 3000 m and above. The study suggests singing as a potential intervention to improve oxygen saturation at high altitudes. A study with a larger sample in a hypobaric chamber as well as in a real environment is recommended.
  16. Mohd Khairuddin KA, Ahmad K, Mohd Ibrahim H, Yan Y
    J Voice, 2021 Jul;35(4):636-645.
    PMID: 31864891 DOI: 10.1016/j.jvoice.2019.12.005
    Despite its clear advantages, laryngeal high-speed videoendoscopy (LHSV) has not yet been accepted as a routine imaging tool for the evaluation of vocal fold vibration due to the unavailability of methods to effectively analyze the huge number of images from the LHSV recording. Recently, a promising LHSV-based analysis method has been introduced. The ability of this analysis method in studying the vocal fold vibratory behaviors has been substantially demonstrated. However, some practical aspects of its clinical applications still require further attention. Most fundamentally, the criteria for the measurement input, ie, a segment of interest (SOI), have not been fully defined. Particularly, the length of the SOI and the location along the sample where it needs to be selected require further confirmation. Meanwhile, the analysis using any options of a well-delineated glottal area demands verification. Without clear criteria for the SOI, it is difficult to demonstrate the relevance of this analysis method in clinical voice assessment. Therefore, the aim of the present study is to establish the criteria for the SOI, which involved the investigations on the length of the SOI and the location along the sample where it needs to be selected, as well as the use of any options of a well-delineated glottal area for analysis. The participants in the present study consisted of 36 young normophonic females. The methods involved LHSV recording of the images of the vibrating vocal folds. The captured images were then analyzed using the method. The LHSV-based measures from the analyses were compared according to the specified procedures of each investigation. Results indicated that 2000 frames should be used as the SOI length. The SOI could be selected at any location along the sample as long as well-delineated glottal areas were observed. With the current findings, a more conclusive measurement protocol is available to ensure reliable LHSV-based measures. The findings further support this analysis method for clinical application, which in turn promotes LHSV as a reliable laryngeal imaging tool in clinical settings.
  17. Ab Rani A, Azman M, Ubaidah MA, Mohamad Yunus MR, Sani A, Mat Baki M
    J Voice, 2021 May;35(3):487-492.
    PMID: 31732294 DOI: 10.1016/j.jvoice.2019.09.017
    OBJECTIVE: This study compared the voice outcomes of selected patients with unilateral vocal fold palsy (UVFP) who underwent either nonselective laryngeal reinnervation (LR) or Type 1 thyroplasty (thyroplasty) in a Malaysian tertiary centre using multidimensional voice assessments.

    PARTICIPANTS: The study included 16 patients with UVFP who underwent either LR (9 patients) or thyroplasty (7 patients) between 2015 and 2018 who fulfilled the inclusion criteria.

    MAIN OUTCOME MEASURES: The outcomes were measured subjectively and objectively with: (1) voice handicap index-10 (VHI-10- Malay version); (2) auditory perceptual evaluation using the breathiness component of Grade, Roughness, Breathiness, Asthenia, Strain scale; (3) maximum phonation time (MPT); and (4) acoustic analysis (jitter%, shimmer%, and NHR) using OperaVOX™. The outcomes were measured at baseline, 6 and 12-months postoperative. The comparison of outcomes between pre and postoperative of each group was evaluated using one-way ANOVA test. Mann-Whitney test was used to compare the outcomes between the two groups.

    RESULTS: Comparison of each group at different time points showed significant improvement of VHI-10 and MPT in the LR group between baseline and 12 months (P ≤ 0.05), whereas the improvement in the thyroplasty group was observed at all time points (P ≤ 0.05). When comparing the two groups at 12 months, the VHI-10 and MPT were significantly better in the LR group than in the thyroplasty group, with P = 0.004 and P = 0.001 respectively. Other outcome measures did not reveal significant difference between the two groups.

    CONCLUSION: This observational study showed that LR may be better than thyroplasty in improving VHI-10 and MPT in selected patients with UVFP.

  18. Ali Z, Alsulaiman M, Muhammad G, Elamvazuthi I, Al-Nasheri A, Mesallam TA, et al.
    J Voice, 2017 May;31(3):386.e1-386.e8.
    PMID: 27745756 DOI: 10.1016/j.jvoice.2016.09.009
    A large population around the world has voice complications. Various approaches for subjective and objective evaluations have been suggested in the literature. The subjective approach strongly depends on the experience and area of expertise of a clinician, and human error cannot be neglected. On the other hand, the objective or automatic approach is noninvasive. Automatic developed systems can provide complementary information that may be helpful for a clinician in the early screening of a voice disorder. At the same time, automatic systems can be deployed in remote areas where a general practitioner can use them and may refer the patient to a specialist to avoid complications that may be life threatening. Many automatic systems for disorder detection have been developed by applying different types of conventional speech features such as the linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCCs). This study aims to ascertain whether conventional speech features detect voice pathology reliably, and whether they can be correlated with voice quality. To investigate this, an automatic detection system based on MFCC was developed, and three different voice disorder databases were used in this study. The experimental results suggest that the accuracy of the MFCC-based system varies from database to database. The detection rate for the intra-database ranges from 72% to 95%, and that for the inter-database is from 47% to 82%. The results conclude that conventional speech features are not correlated with voice, and hence are not reliable in pathology detection.
  19. Mohd Khairuddin KA, Ahmad K, Ibrahim HM, Yan Y
    J Voice, 2022 Jan;36(1):106-112.
    PMID: 32456835 DOI: 10.1016/j.jvoice.2020.04.027
    Ideally, an analysis method for laryngeal high-speed videoendoscopy (LHSV) based on the glottal area waveforms (GAW) requires images of a complete view of the glottis to ensure findings that are representatives of the vibratory behaviors of the whole vocal folds. However, in practice, the preferred images may not be obtained at all times. Often, the only available images that a clinician has to work with consist of a partial view of the glottis. This study aims to examine the effects of using images of a partial view of the glottis (ie, posterior-middle, anterior-middle, or middle) on the LHSV-based measures (ie, fundamental frequency (F0GAW), frequency perturbation (jitterGAW), amplitude perturbation (shimmerGAW), open quotient (OQGAW), and Nyquist plot). The participants consisted of 9 young normophonic females. The procedures involved LHSV recording of the vibration of the vocal folds. The images of the complete view of the glottis were analyzed to obtain the LHSV-based measures. The same images were used to simulate the images of partial views of the glottis by changing the outline of the region of interest to include only either the posterior-middle, anterior-middle, or middle parts of the glottis. The LHSV-based measures from the images of the partial views were then compared to those with the complete view. The results showed that all LHSV-based measures from the images of the posterior-middle view were similar to those of the complete view. However, only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views were similar to those of the complete view. Lower OQGAW and different Nyquist plots than those of the complete view were generated by the images of the anterior-middle and middle views. In conclusion, all LHSV-based measures from the images of the posterior-middle view of the glottis, and only the F0GAW, jitterGAW, and shimmerGAW from the images of the anterior-middle and middle views of the glottis reflect the vibratory behaviors of the whole vocal folds. The same conclusion could not be applied to the OQGAW and Nyquist plots of the images of the anterior-middle and middle views of the glottis. A possible effect of the presence or absence of a posterior glottal gap on the findings warrants further confirmation.
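
Several of the abstracts above report jitter, shimmer, and relative average perturbation (RAP) without stating how they are computed. The following is a minimal sketch, assuming the commonly used textbook definitions: local jitter and shimmer as the mean absolute cycle-to-cycle difference divided by the mean value, and RAP as the mean deviation of each period from its three-point moving average divided by the mean period. The function names and the sample values are hypothetical, and the listed studies may have used software-specific variants, which is consistent with the caution in some abstracts that perturbation values are not directly comparable across analysis packages.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, in percent of the mean.

    Applied to cycle periods this is local jitter (%);
    applied to cycle peak amplitudes it is local shimmer (%).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def rap(periods):
    """Relative average perturbation (%), using a three-point moving average."""
    if len(periods) < 3:
        raise ValueError("need at least three cycles")
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    return 100.0 * mean(deviations) / mean(periods)


# Hypothetical cycle data (not taken from any study listed above).
periods_ms = [5.0, 5.1, 4.9, 5.0, 5.0]       # glottal cycle periods in ms
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]  # peak amplitude per cycle

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
rap_percent = rap(periods_ms)
```

The same function serves for jitter and shimmer because both measures share one formula and differ only in the per-cycle quantity supplied. The five-point (PPQ5) and 11-point (APQ11) quotients named in the abstracts follow the same pattern with a wider averaging window.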