Displaying publications 1 - 20 of 46 in total

  1. Ali Z, Elamvazuthi I, Alsulaiman M, Muhammad G
    J Med Syst, 2016 Jan;40(1):20.
    PMID: 26531753 DOI: 10.1007/s10916-015-0392-2
    Voice disorders are associated with irregular vibrations of the vocal folds. Based on the source-filter theory of speech production, these irregular vibrations can be detected non-invasively by analyzing the speech signal. In this paper we present a multiband approach for the detection of voice disorders, given that the voice source generally interacts with the vocal tract in a non-linear way. In normal phonation, and assuming sustained phonation of a vowel, the lower frequencies of speech are heavily source dependent due to the low-frequency glottal formant, while the higher frequencies are less dependent on the source signal. During abnormal phonation this still holds, but turbulent noise at the source, caused by the irregular vibration, also affects the higher frequencies. Motivated by such a model, we suggest a multiband approach based on a three-level discrete wavelet transformation (DWT), in which the fractal dimension (FD) of the estimated power spectrum is computed in each band. The experiments suggest that the frequency band 1-1562 Hz (the lower frequencies after level 3) exhibits a significant difference between the spectra of normal and pathological subjects. With this band, a detection rate of 91.28 % is obtained with one feature, which is higher than the result for any other frequency band. Moreover, an accuracy of 92.45 % and an area under the receiver operating characteristic curve (AUC) of 95.06 % are obtained when the FDs of all levels are fused. Likewise, when the FDs of all levels are combined with 22 Multi-Dimensional Voice Program (MDVP) parameters, an improvement of 2.26 % in accuracy and 1.45 % in AUC is observed.
    Matched MeSH terms: Voice/physiology; Voice Disorders/diagnosis*; Voice Disorders/physiopathology*
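    As a rough illustration of the band split described in the abstract above, the sketch below lists the nominal frequency bands of a three-level dyadic DWT. The 25 kHz sampling rate is an assumption inferred from the reported 1562 Hz band edge (12500 / 8 = 1562.5); the abstract does not state it. The function name dwt_bands is hypothetical, and real wavelet filters overlap at the band edges.

```python
def dwt_bands(fs_hz, levels):
    """Return nominal (label, low_hz, high_hz) bands of a dyadic DWT.

    Each level halves the remaining low band: the detail band D_k covers
    the upper half, and the final approximation A_levels covers what is left.
    """
    nyquist = fs_hz / 2.0
    bands = []
    for level in range(1, levels + 1):
        high = nyquist / 2 ** (level - 1)
        low = nyquist / 2 ** level
        bands.append((f"D{level}", low, high))
    bands.append((f"A{levels}", 0.0, nyquist / 2 ** levels))
    return bands


# Assumed sampling rate of 25 kHz (not stated in the abstract).
for label, low, high in dwt_bands(25000, 3):
    print(f"{label}: {low:.1f} to {high:.1f} Hz")
```

    Under that assumption, the level-3 approximation band ends at 1562.5 Hz, which matches the 1-1562 Hz band the abstract reports as most discriminative.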
  2. Nor Ashikin Rahman, Noor Azilah Muda, Norashikin Ahmad
    MyJurnal
    Combining Mel Frequency Cepstral Coefficients (MFCC) with the wavelet transform for feature extraction is not new. This paper proposes a new architecture to increase the accuracy of speaker recognition compared with the conventional architecture. In the conventional speaker model, the voice undergoes noise elimination before feature extraction. The proposed architecture, however, extracts the features and eliminates noise simultaneously. The MFCC is used to extract the voice features, while a wavelet de-noising technique is used to eliminate the noise contained in the speech signals. Thus, the new architecture achieves two outcomes in a single process: extracting voice features and eliminating noise.
    Matched MeSH terms: Voice
  3. Muthukumar P, Balasubramaniam P, Ratnavelu K
    ISA Trans, 2018 Nov;82:51-61.
    PMID: 28755926 DOI: 10.1016/j.isatra.2017.07.007
    This paper proposes a generalized robust synchronization method for different dimensional fractional order dynamical systems with mismatched fractional derivatives in the presence of function uncertainty and external disturbance by designing a sliding mode controller. Based on the proposed theory of generalized robust synchronization criterion, a novel audio cryptosystem is proposed for sending or sharing voice messages secretly via an insecure channel. Numerical examples are given to verify the potency of the proposed theories.
    Matched MeSH terms: Voice
  4. Chan MY, Chu SY, Ahmad K, Ibrahim NM
    J Telemed Telecare, 2021 Apr;27(3):174-182.
    PMID: 31431134 DOI: 10.1177/1357633X19870913
    INTRODUCTION: Intensive voice therapy is one of the best evidence-based treatments to improve speech and voice difficulties to individuals with Parkinson's disease (PD). However, accessibility to intensive voice therapy is highly challenging in Malaysia due to the lack of voice specialised speech-language therapists. This study examined the feasibility of using smartphone videoconference to deliver intensive voice therapy to individuals with PD in Malaysia.

    METHODS: Intensive voice therapy was delivered to 11 adults with PD using a smartphone videoconference method via WhatsApp Messenger freeware. The therapy consisted of 12 sessions over four weeks and focused on increasing vocal loudness. Outcomes were assessed using objective, perceptual and quality-of-life measures pre and post treatment. Participant satisfaction with the telerehabilitation method was obtained via the Smartphone-Based Therapy Satisfaction Questionnaire.

    RESULTS: Significant gains were reported for sound pressure level in sustained vowels and monologue. Perceptual ratings showed significant improvements in overall mean severity and loudness after treatment. Mean scores of speech intelligibility and Voice Handicap Index-10 were significantly better post treatment. Overall, participants were highly satisfied with the smartphone videoconference method.

    DISCUSSION: Present results suggest that the smartphone videoconference method is feasible to deliver intensive voice therapy to individuals with PD to gain better speech and voice functions. Future studies need to address the standardisation of the system protocol to optimise this novel service delivery method in Malaysia.

    Matched MeSH terms: Voice Disorders*
  5. Mohd Khairuddin KA, Ahmad K, Mohd Ibrahim H, Yan Y
    J Voice, 2021 Jul;35(4):636-645.
    PMID: 31864891 DOI: 10.1016/j.jvoice.2019.12.005
    Despite its clear advantages, laryngeal high-speed videoendoscopy (LHSV) has not yet been accepted as a routine imaging tool for the evaluation of vocal fold vibration due to the unavailability of methods to effectively analyze the huge number of images from the LHSV recording. Recently, a promising LHSV-based analysis method has been introduced. The ability of this analysis method to study vocal fold vibratory behaviors has been substantially demonstrated. However, some practical aspects of its clinical application still require further attention. Most fundamentally, the criteria for the measurement input, i.e., a segment of interest (SOI), have not been fully defined. In particular, the length of the SOI and the location along the sample where it needs to be selected require further confirmation. Meanwhile, the analysis using any options of a well-delineated glottal area demands verification. Without clear criteria for the SOI, it is difficult to demonstrate the relevance of this analysis method in clinical voice assessment. Therefore, the aim of the present study is to establish the criteria for the SOI, which involved investigating the length of the SOI and the location along the sample where it needs to be selected, as well as the use of any options of a well-delineated glottal area for analysis. The participants in the present study consisted of 36 young normophonic females. The methods involved LHSV recording of the images of the vibrating vocal folds. The captured images were then analyzed using the method. The LHSV-based measures from the analyses were compared according to the specified procedures of each investigation. Results indicated that 2000 frames should be used as the SOI length. The SOI could be selected at any location along the sample as long as well-delineated glottal areas were observed. With the current findings, a more conclusive measurement protocol is available to ensure reliable LHSV-based measures. The findings further support this analysis method for clinical application, which in turn promotes LHSV as a reliable laryngeal imaging tool in the clinical setting.
    Matched MeSH terms: Voice Quality*
  6. Moy FM, Hoe VC, Hairi NN, Chu AH, Bulgiba A, Koh D
    PLoS One, 2015;10(11):e0141963.
    PMID: 26540291 DOI: 10.1371/journal.pone.0141963
    OBJECTIVES: To establish the prevalence of voice disorder using the Malay-Voice Handicap Index 10 (Malay-VHI-10) and to study the determinants, quality of life, depression, anxiety and stress associated with voice disorder among secondary school teachers in Peninsular Malaysia.

    METHODS: This study was divided into two phases. Phase I tested the reliability of the Malay-VHI-10 while Phase II was a cross-sectional study with two-stage sampling. In Phase II, a self-administered questionnaire was used to collect socio-demographic and teaching characteristics, depression, anxiety and stress scale (Malay version of DASS-21); and health-related quality of life (Malay version of SF12-v2). Complex sample analysis was conducted using multivariate Poisson regression with robust variance.

    RESULTS: In Phase I, the Spearman correlation coefficient and Cronbach alpha for total VHI-10 score was 0.72 (p < 0.001) and 0.77 respectively; showing good correlation and internal consistency. The ICCs ranged from 0.65 to 0.78 showing fair to good reliability and demonstrating the subscales to be reliable and stable. A total of 6039 teachers participated in Phase II. They were primarily Malays, females, married, had completed tertiary education and aged between 30 to 50 years. A total of 10.4% (95% CI 7.1, 14.9) of the teachers had voice disorder (VHI-10 score > 11). Compared to Malays, a greater proportion of ethnic Chinese teachers reported voice disorder while ethnic Indian teachers were less likely to report this problem. There was a higher prevalence ratio (PR) of voice disorder among single or divorced/widowed teachers. Teachers with voice disorder were more likely to report higher rates of absenteeism (PR: 1.70, 95% CI 1.33, 2.19), lower quality of life with lower SF12-v2 physical (0.98, 95% CI 0.96, 0.99) and mental (0.97, 95% CI 0.96, 0.98) component summary scales; and higher anxiety levels (1.04, 95% CI 1.02, 1.06).

    CONCLUSIONS: The Malay-VHI-10 is valid and reliable. Voice disorder was associated with increased absenteeism, marginally associated with reduced health-related quality of life as well as increased anxiety among teachers.

    Matched MeSH terms: Voice/physiology; Voice Disorders/etiology*; Voice Disorders/epidemiology*; Voice Quality/physiology
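    The internal-consistency figure quoted in the abstract above (Cronbach alpha of 0.77) follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is a minimal illustration of that formula only; the function name cronbach_alpha and the sample scores are hypothetical and are not data from the study.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire.

    items: list of per-item score lists, each holding one score per
    respondent (all lists the same length, at least two items).
    """
    k = len(items)
    # Sum of the sample variances of the individual items.
    item_var = sum(statistics.variance(scores) for scores in items)
    # Total score of each respondent across all items.
    totals = [sum(resp) for resp in zip(*items)]
    return k / (k - 1) * (1 - item_var / statistics.variance(totals))


# Hypothetical scores for a two-item scale answered by four respondents.
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

    Perfectly consistent items give an alpha of 1, and values fall as the items disagree; the function is undefined when the total scores have zero variance.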
  7. Mat Baki M, Wood G, Alston M, Ratcliffe P, Sandhu G, Rubin JS, et al.
    Clin Otolaryngol, 2015 Feb;40(1):22-8.
    PMID: 25263076 DOI: 10.1111/coa.12313
    OBJECTIVE: To evaluate the agreement between OperaVOX and MDVP.

    DESIGN: Cross sectional reliability study.

    SETTING: University teaching hospital.

    METHODS: Fifty healthy volunteers and 50 voice disorder patients had supervised recordings in a quiet room using OperaVOX with the iPod's internal microphone at a sampling rate of 45 kHz. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer and noise-to-harmonic ratio (NHR). All healthy volunteers and 21 patients had a second recording. The recorded voices were also analysed using the MDVP. The inter- and intrasoftware reliability was analysed using the intraclass correlation (ICC) test and the Bland-Altman (BA) method. The Mann-Whitney test was used to compare the acoustic parameters between healthy volunteers and patients.

    RESULTS: Nine of 50 patients had severe aperiodic voice. The ICC was high with a confidence interval of >0.75 for the inter- and intrasoftware reliability except for the NHR. For the intersoftware BA analysis, excluding the severe aperiodic voice data sets, the bias (95% LOA) of F0, jitter, shimmer and NHR was 0.81 (11.32, -9.71); -0.13 (1.26, -1.52); -0.52 (1.68, -2.72); and 0.08 (0.27, -0.10). For the intrasoftware reliability, it was -1.48 (18.43, -21.39); 0.05 (1.31, -1.21); -0.01 (2.87, -2.89); and 0.005 (0.20, -0.18), respectively. Normative data from the healthy volunteers were obtained. There was a significant difference in all acoustic parameters between volunteers and patients measured by the Opera-VOX (P 

    Matched MeSH terms: Voice Disorders/diagnosis*; Voice Disorders/physiopathology*; Voice Quality/physiology*
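    The "bias (95% LOA)" figures in the abstract above come from a Bland-Altman analysis, in which the bias is the mean of the paired differences and the 95% limits of agreement are conventionally bias ± 1.96 standard deviations of those differences. The sketch below shows that conventional calculation; the function name bland_altman and the sample values are hypothetical, and the study's exact computation is not described in the abstract.

```python
import statistics


def bland_altman(a, b):
    """Return (bias, upper_loa, lower_loa) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias + 1.96 * sd, bias - 1.96 * sd


# Hypothetical F0 readings (Hz) of the same recordings from two programs.
tool_a = [120.4, 210.1, 180.9, 99.7, 250.3]
tool_b = [119.8, 211.0, 179.5, 101.2, 248.9]
bias, upper, lower = bland_altman(tool_a, tool_b)
print(f"bias {bias:.2f} (95% LOA {upper:.2f}, {lower:.2f})")
```

    The limits are symmetric about the bias, which is consistent with the reported F0 values of 0.81 (11.32, -9.71).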
  8. Roscella Inja, Abdul Rahman H
    MyJurnal
    Teachers face one of the highest demands of any professional group to use their voices at work. Thus, they are at higher risk of developing voice disorder than the general population. The consequences of voice disorder may have impact on teacher’s social and professional life as well as their mental, physical and emotional state and their ability to communicate. Objectives of this study are to determine the prevalence of voice disorder and the relationship between voice disorder with associated risk factors such as teaching activities and lifestyle factors among primary school teachers in Bintulu, Sarawak. A cross sectional study was conducted based on random sample of 4 primary schools in Bintulu, Sarawak between January-March 2014. A total of 100 full-time primary school teachers were invited to participate in the study. Data were collected through a self-administered questionnaire addressing the prevalence of voice disorder and potential risk factors. Descriptive analysis and chi-square test was used to measure the relationship between voice disorder and associated risk factors. The response rate for this study was 78% (78/100). The study found that the prevalence of voice disorder among primary school teachers in Bintulu, Sarawak was 13%. Chi-square test results revealed that factors significantly associated with voice disorder (p
    Matched MeSH terms: Voice Disorders*
  9. Kmil, D., Baesah, G., Dewi Mumi, M.Y.
    MyJurnal
    Flooding is the most frequent of all natural disasters. A flood is any water flow that exceeds the capacity of the drainage system and usually subsides in a relatively short period. However, the flood that hit Batu Pahat District was different from those in other districts. The Batu Pahat flooding extended for 48 days from the first wave until it subsided fully. It had positive and negative effects not only on the victims but also on the health care workers (HCWs) executing their duties during and after the flood. This write-up aims to share the experiences and voices of the HCWs who were involved in the flood disaster. The methods used were brainstorming sessions, discussion, observation and interview. From this study, 10 main themes were highlighted. This flood disaster has prompted the HCWs to prepare mentally and physically and to increase their knowledge and skills to face any disaster in the future.
    Matched MeSH terms: Voice
  10. Al-Haddad, S.A.R., Samad, S.A., Hussain, A., Ishak, K.A., Noor, A.O.A.
    ASM Science Journal, 2008;2(1):75-81.
    MyJurnal
    Robustness is a key issue in speech recognition. A speech recognition algorithm for Malay digits from zero to nine and an algorithm for noise cancellation using recursive least squares (RLS) are proposed in this article. The system consisted of speech processing inclusive of digit margin and recognition using zero crossing and energy calculations. Mel-frequency cepstral coefficient vectors were used to provide an estimate of the vocal tract filter. Meanwhile, dynamic time warping was used to detect the nearest recorded voice with an appropriate global constraint. The global constraint was used to set a valid search region: because the variation in the speaker's speech rate was considered to be limited to a reasonable range, the unreasonable part of the search space could be pruned. The algorithm was tested on speech samples that were recorded as part of a Malay corpus. The results showed that the algorithm managed to recognize almost 80.5% of the Malay digits for all recorded words. The addition of an RLS noise canceller in the preprocessing stage increased the accuracy to 94.1%.
    Matched MeSH terms: Voice
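    The dynamic time warping step with a global constraint, as described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions: the constraint is modelled as a Sakoe-Chiba band of half-width `window`, which is one common choice and not necessarily the one the authors used, and the sequences are plain numbers rather than MFCC vectors. The function name dtw_distance is hypothetical.

```python
import math


def dtw_distance(a, b, window):
    """DTW cost between sequences a and b inside a band of half-width window.

    Cells with |i - j| > window are never filled, which prunes warping
    paths that would imply an unreasonable change in speech rate.
    """
    n, m = len(a), len(b)
    # The band must be at least as wide as the length difference,
    # otherwise the end cell (n, m) would be unreachable.
    window = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between frames
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]


# A template and a slightly slower rendition of the same contour.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3], window=1))
```

    In a recognizer of this kind, the input would be compared against every stored template and the template with the smallest cost would be chosen; a narrower band trades tolerance to rate variation for less computation.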
  11. Cahyani NDW, Martini B, Choo KR, Ab Rahman NH, Ashman H
    J Forensic Sci, 2018 May;63(3):868-881.
    PMID: 28833117 DOI: 10.1111/1556-4029.13624
    Communication apps can be an important source of evidence in a forensic investigation (e.g., in the investigation of a drug trafficking or terrorism case where the communications apps were used by the accused persons during the transactions or planning activities). This study presents the first evidence-based forensic taxonomy of Windows Phone communication apps, using an existing two-dimensional Android forensic taxonomy as a baseline. Specifically, 30 Windows Phone communication apps, including Instant Messaging (IM) and Voice over IP (VoIP) apps, are examined. Artifacts extracted using physical acquisition are analyzed, and seven digital evidence objects of forensic interest are identified, namely: Call Log, Chats, Contacts, Locations, Installed Applications, SMSs and User Accounts. Findings from this study would help to facilitate timely and effective forensic investigations involving Windows Phone communication apps.
    Matched MeSH terms: Voice
  12. Ali Z, Alsulaiman M, Muhammad G, Elamvazuthi I, Al-Nasheri A, Mesallam TA, et al.
    J Voice, 2017 May;31(3):386.e1-386.e8.
    PMID: 27745756 DOI: 10.1016/j.jvoice.2016.09.009
    A large population around the world has voice complications. Various approaches for subjective and objective evaluations have been suggested in the literature. The subjective approach strongly depends on the experience and area of expertise of a clinician, and human error cannot be neglected. On the other hand, the objective or automatic approach is noninvasive. Automatic developed systems can provide complementary information that may be helpful for a clinician in the early screening of a voice disorder. At the same time, automatic systems can be deployed in remote areas where a general practitioner can use them and may refer the patient to a specialist to avoid complications that may be life threatening. Many automatic systems for disorder detection have been developed by applying different types of conventional speech features such as the linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCCs). This study aims to ascertain whether conventional speech features detect voice pathology reliably, and whether they can be correlated with voice quality. To investigate this, an automatic detection system based on MFCC was developed, and three different voice disorder databases were used in this study. The experimental results suggest that the accuracy of the MFCC-based system varies from database to database. The detection rate for the intra-database ranges from 72% to 95%, and that for the inter-database is from 47% to 82%. The results conclude that conventional speech features are not correlated with voice, and hence are not reliable in pathology detection.
    Matched MeSH terms: Voice Disorders/diagnosis*; Voice Disorders/physiopathology; Voice Quality*
  13. Ali Z, Elamvazuthi I, Alsulaiman M, Muhammad G
    J Voice, 2016 Nov;30(6):757.e7-757.e19.
    PMID: 26522263 DOI: 10.1016/j.jvoice.2015.08.010
    BACKGROUND AND OBJECTIVE: Automatic voice pathology detection using sustained vowels has been widely explored. Because of the stationary nature of the speech waveform, pathology detection with a sustained vowel is a comparatively easier task than that using a running speech. Some disorder detection systems with running speech have also been developed, although most of them are based on a voice activity detection (VAD), that is, itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module.

    METHOD: A set of three psychophysics conditions of hearing (critical band spectral estimation, equal loudness hearing curve, and the intensity loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrums are computed and analyzed and used in a Gaussian mixture model for an automatic decision.

    RESULTS: In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech-based systems.

    DISCUSSION: The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.

    Matched MeSH terms: Voice Disorders/classification; Voice Disorders/diagnosis*; Voice Disorders/physiopathology; Voice Quality*
  14. Mirhassani SM, Zourmand A, Ting HN
    ScientificWorldJournal, 2014;2014:534064.
    PMID: 25006595 DOI: 10.1155/2014/534064
    Automatic estimation of a speaker's age is a challenging research topic in the area of speech analysis. In this paper, a novel approach to estimate a speaker's age is presented. The method features a "divide and conquer" strategy wherein the speech data are divided into six groups based on the vowel classes. There are two reasons behind this strategy. First, reduction in the complicated distribution of the processing data improves the classifier's learning performance. Second, different vowel classes contain complementary information for age estimation. Mel-frequency cepstral coefficients are computed for each group and single layer feed-forward neural networks based on self-adaptive extreme learning machine are applied to the features to make a primary decision. Subsequently, fuzzy data fusion is employed to provide an overall decision by aggregating the classifier's outputs. The results are then compared with a number of state-of-the-art age estimation methods. Experiments conducted based on six age groups including children aged between 7 and 12 years revealed that fuzzy fusion of the classifier's outputs resulted in considerable improvement of up to 53.33% in age estimation accuracy. Moreover, the fuzzy fusion of decisions aggregated the complementary information of a speaker's age from various speech sources.
    Matched MeSH terms: Voice/physiology
  15. Ahmad K, Yan Y, Bless DM
    J Voice, 2012 Mar;26(2):239-53.
    PMID: 21621975 DOI: 10.1016/j.jvoice.2011.02.001
    The purpose of the study was to investigate relationships between vocal fold vibrations and voice quality. Laryngeal images obtained from high-speed digital imaging (HSDI) were examined for their open-closed timing characteristics and perturbation values. A customized software delineated the glottal edges and used the Hilbert transform-based method of analysis to provide objective quantification of glottal perturbation. Overlay tracings of the transformed glottal cycles provided visual patterns on the overall vibratory dynamics. In this paper, we described the use of this method in looking at vibratory characteristics of a group of young female speakers (N=23). We found that, females with no voice complaints and who had been perceived to have normal voices were not a homogeneous group in terms of their glottal vibratory patterns during phonation. Their vibratory patterns showed characteristics similar to exemplar voices targeted to be clear (50%), pressed (27%), breathy (15%), or a mixed quality (8%). Perturbation range in terms of cycle-to-cycle frequency and amplitude was small and did not discriminate patterns. All these patterns yielded perceptually normal voices suggesting that in normal young speakers, the level of perturbation may be more important to the judgment than the actual pattern of closure.
    Matched MeSH terms: Voice*
  16. Wong EHC, Chong AW
    Am J Otolaryngol, 2019 Dec 05;41(2):102367.
    PMID: 31831185 DOI: 10.1016/j.amjoto.2019.102367
    BACKGROUND: Many studies have looked at the effect of functional endoscopic sinus surgeries (FESS) on nasalance, nasal consonant and nasalized vowels. Only two studies investigated the effect of FESS on vocal sound quality and have not found statistically significant changes before and after operations. The aim of this study was to examine the short-term and long-term objective and subjective changes in the vocal quality of patients after FESS, comparing patients with and without nasal polyps.

    METHODS: Sixteen patients were recruited for voice analysis during pre-operative, within two weeks and at least three months post-operatively. Subjective questionnaire was used to assess perception of voice changes.

    RESULTS: There were no statistically significant changes in the acoustic parameters of patients with nasal polyposis. In patients with CRS without polyps, there was a statistically significant increase in fundamental frequency (F0) in nasal sound during early follow up. The changes in soft phonation index (SPI) values between the two groups were statistically significant during early follow-ups. Only patients with nasal polyposis perceived a subjective change in their voice post-operatively.

    CONCLUSIONS: Clinicians should inform all patients, especially voice professionals about the possible effects of endoscopic sinus surgeries on their voice quality.

    Matched MeSH terms: Voice/physiology*
  17. Ramli MI, Hamzaid NA, Engkasan JP
    J Voice, 2019 Jul 09.
    PMID: 31300185 DOI: 10.1016/j.jvoice.2019.06.006
    OBJECTIVES: The aim of this study was to investigate the performance of mechanomyography (MMG) and electromyography (EMG) in monitoring the sternocleidomastoid (SCM) as accessory respiratory muscles when breathing during singing.

    METHODS: MMG and EMG were used to record the activity of the SCM in 32 untrained singers reciting a monotonous text and a standard folk song. Their voices were recorded and their pitch, or fundamental frequency (FF), and intensity were derived using Praat software. Instants of inhale and exhales were identified during singing from their voice recordings and the corresponding SCM MMG and EMG activities were analysed.

    RESULTS: The SCM MMG, and EMG signals during breathing while singing were significantly different than breathing at rest (p < 0.001). On the other hand, MMG was relatively better correlated to voice intensity in both reading and singing than EMG. EMG was better, but not significantly, correlated with FF in both reading and singing as compared to MMG.

    CONCLUSIONS: This study established MMG and EMG as quantitative measurement tools to monitor breathing activities during singing. This is useful for applications related to measuring singing therapy performance, including in potentially pathologically affected populations. While MMG and EMG could not distinguish FF and intensity significantly, they are useful as a proxy for inhalation and exhalation levels throughout a particular singing session. Further studies are required to determine their efficacy in a therapeutic setting.

    Matched MeSH terms: Voice
  18. Lee ST, Niimi S
    J Laryngol Otol, 1990 Nov;104(11):876-8.
    PMID: 2266311
    Vocal fold sulcus is a cause of dysphonia which has not been recognized until recently. Awareness of its existence combined with use of laryngostroboscopy would enhance the management of this group of patients. Five such cases were treated initially by voice therapy and subsequently combined with microlaryngeal Teflon injections of the vocal cord. Representative photomicrographs and the end results of treatment are presented. A good voice, subjectively and objectively, was obtained in three patients, with satisfactory improvement in the other two.
    Matched MeSH terms: Voice Disorders/therapy; Voice Training
  19. Muhammad Afiq Mohd Aizam, Nor Shahanim Mohamad Hadis, Samihah Abdullah
    ESTEEM Academic Journal, 2020;16(1):59-73.
    MyJurnal
    Disabled persons usually require an assistant to help them in their daily routines especially for their mobility. The limitation of being physically impaired affects the quality of life in executing their daily routine especially the ones with a wheelchair. Pushing a wheelchair has its own side effects for the user especially the person with hands and arms impairments. This paper aims to develop a smart wheelchair system integrated with home automation. With the advent of the Internet of Things (IoT), a smart wheelchair can be operated using voice command through the Google assistant Software Development Kit (SDK). The smart wheelchair system and the home automation of this study were powered by Raspberry Pi 3 B+ and NodeMCU, respectively. Voice input commands were processed by the Google assistant Artificial Intelligence Yourself (AIY) to steer the movement of wheelchair. Users were able to speak to Google to discover any information from the website. For the safety of the user, a streaming camera was added on the wheelchair. An improvement to the wheelchair system that was added on the wheelchair is its combination with the home automation to help the impaired person to control their home appliances through Blynk application. Observations on three voice tones (low, medium and high) of voice command show that the minimum voice intensity for this smart wheelchair system is 68.2 dB. Besides, the user is also required to produce a clear voice command to increase the system accuracy.
    Matched MeSH terms: Voice
  20. Farah Nazlia Che Kassim, Muthusamy, Hariharan, Vijean, Vikneswaran, Zulkapli Abdullah, Rokiah Abdullah
    MyJurnal
    Voice pathology analysis has been one of the useful tools in the diagnosis of the pathological voice, as the method is non-invasive, inexpensive, and can reduce the time required for the analysis. This paper investigates feature extraction based on the Dual-Tree Complex Wavelet Packet Transform (DT-CWPT) using energy and entropy measures tested with two classifiers, k-Nearest Neighbors (k-NN) and Support Vector Machine (SVM). Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and Saarbruecken Voice Database (SVD) were used. Five datasets of voice samples were used from these databases, including normal and abnormal samples, Cysts, Vocal Nodules, Polyp, and Paralysis vocal fold. To the best of the authors’ knowledge, very few studies were done on multiclass classifications using specific pathology database. File-based and frame-based investigation for two-class and multiclass were considered. In the two-class analysis using the DT-CWPT with entropies, the classification accuracy of 100% and 99.94% was achieved for MEEI and SVD database respectively. Meanwhile, the classification accuracy for multiclass analysis comprised of 99.48% for the MEEI database and 99.65% for SVD database. The experimental results using the proposed features provided promising accuracy to detect the presence of diseases in vocal fold.
    Matched MeSH terms: Voice Disorders