Displaying publications 1 - 20 of 22 in total

  1. Ting HN, Chia SY, Abdul Hamid B, Mukari SZ
    J Voice, 2011 Nov;25(6):e305-9.
    PMID: 21429707 DOI: 10.1016/j.jvoice.2010.05.007
    The acoustic characteristics of sustained vowels have been widely investigated across various languages and ethnic groups. These acoustic measures, including fundamental frequency (F(0)), jitter (Jitt), relative average perturbation (RAP), five-point period perturbation quotient (PPQ5), shimmer (Shim), and 11-point amplitude perturbation quotient (APQ11), are not well established for Malaysian Malay young adults. This article studies the acoustic measures of Malaysian Malay adults using acoustical analysis. The study analyzed six sustained Malay vowels of 60 normal native Malaysian Malay adults with a mean age of 21.19 years. The F(0) values of Malaysian Malay males and females were 134.85±18.54 and 238.27±24.06 Hz, respectively. Malaysian Malay females had significantly higher F(0) than males for all the vowels. However, no significant differences were observed between the genders for the perturbation measures in any of the vowels, except RAP in /e/. No significant F(0) differences between the vowels were observed. Significant differences between the vowels were reported for all perturbation measures in Malaysian Malay males. As for Malaysian Malay females, significant differences between the vowels were reported for Shim and APQ11. Multiethnic comparisons indicate that F(0) varies between Malaysian Malay and other ethnic groups. However, the perturbation measures cannot be directly compared, as they vary significantly across different speech analysis software packages.
    Matched MeSH terms: Speech Acoustics*
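The perturbation measures named in the abstract above have standard cycle-to-cycle definitions. The sketch below is a minimal, hypothetical illustration of Jitt, RAP, PPQ5 and Shim computed from arrays of pitch-period durations and peak amplitudes; it is not the software used in the study, and the synthetic data are for demonstration only.

```python
# Minimal sketch (not the authors' software): jitter, RAP, PPQ5 and shimmer
# computed from hypothetical arrays of pitch-period durations (s) and
# cycle peak amplitudes extracted from a sustained vowel.
import numpy as np

def jitter_local(periods):
    """Jitt (%): mean absolute difference of consecutive periods / mean period."""
    periods = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def ppq(periods, k):
    """k-point period perturbation quotient (%); k=3 gives RAP, k=5 gives PPQ5."""
    periods = np.asarray(periods, dtype=float)
    half = k // 2
    # deviation of each period from the local k-point average
    smoothed = np.convolve(periods, np.ones(k) / k, mode="valid")
    centre = periods[half:len(periods) - half]
    return 100.0 * np.mean(np.abs(centre - smoothed)) / np.mean(periods)

def shimmer_local(amplitudes):
    """Shim (%): mean absolute difference of consecutive peak amplitudes / mean amplitude."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Example with synthetic cycle data (134.85 Hz male F0 from the abstract):
rng = np.random.default_rng(0)
T = 1.0 / 134.85
periods = T * (1 + 0.005 * rng.standard_normal(200))   # ~0.5% cycle-to-cycle variation
amps = 1.0 + 0.03 * rng.standard_normal(200)
print(jitter_local(periods), ppq(periods, 3), ppq(periods, 5), shimmer_local(amps))
```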
  2. Ting HN, Chia SY, Manap HH, Ho AH, Tiu KY, Abdul Hamid B
    J Voice, 2012 Jul;26(4):425-30.
    PMID: 22243972 DOI: 10.1016/j.jvoice.2011.07.001
    This study investigates the fundamental frequency (F(0)) and perturbation measures of sustained vowels in 360 native Malaysian Malay children aged between 7 and 12 years using acoustical analysis.
    Matched MeSH terms: Speech Acoustics*
  3. Sinin Hamdan, Iran Amri Musoddiq, Ahmad Fauzi Musib, Marini Sawawi
    MyJurnal
    The tones of peking 1, 2, 3, 5, 6 and 1’ were investigated using time-frequency analysis (TFA). The frequencies were measured using a PicoScope oscilloscope, the Melda analyzer in Cubase version 9, and Adobe version 3. Three different approaches to time-frequency analysis were used: Fourier spectra (using PicoScope), spectromorphology (using the Melda analyzer) and spectrograms (using Adobe). Fourier spectra only identify the intensity-frequency content of the entire signal, spectromorphology identifies changes in the intensity-frequency spectrum at fixed times, and the Adobe spectrograms track frequency over time. The PicoScope readings produce spectra of the fundamental and overtone frequencies over the entire sound. These overtones are non-harmonic since they are non-integral multiples of the fundamental. The fundamental frequencies of peking 1, 2, 3, 5 and 6 were 1066 Hz (C6), 1178 Hz (D6), 1342 Hz (E6), 1599 Hz (G6) and 1793 Hz (A6) respectively, while peking 1’ was 2123 Hz (C7), i.e. one octave higher than peking 1. The Melda analyzer readings showed that all peking sustained their initial fundamental frequency and overtones from t = 0 until 2 s. The TFA from the Adobe readings provides a description of the sound in the time-frequency plane. From the TFA, peking 1, 2 and 6 exhibited a much gentler attack and more rapid decay than peking 3, 5 and 1’.
    Matched MeSH terms: Speech Acoustics
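As a rough illustration of the analysis views named above (a whole-signal Fourier spectrum and a time-frequency spectrogram), the following sketch uses SciPy on a synthetic signal; the PicoScope, Melda and Adobe tools themselves are not reproduced, and the overtone value and decay rate are assumptions.

```python
# Illustrative sketch, not the study's toolchain: overall Fourier spectrum and
# spectrogram of a synthetic stand-in for peking 1 (1066 Hz fundamental plus
# one assumed non-harmonic overtone).
import numpy as np
from scipy.signal import spectrogram
from scipy.fft import rfft, rfftfreq

fs = 44100
t = np.arange(0, 2.0, 1 / fs)
x = np.exp(-3 * t) * (np.sin(2 * np.pi * 1066 * t) + 0.4 * np.sin(2 * np.pi * 2870 * t))

# Fourier spectrum over the entire signal (PicoScope-style view)
spec = np.abs(rfft(x))
freqs = rfftfreq(len(x), 1 / fs)
print("dominant frequency:", freqs[np.argmax(spec)])        # ~1066 Hz

# time-frequency view: how the partials decay over time
f, tt, Sxx = spectrogram(x, fs=fs, nperseg=4096, noverlap=2048)
peak_track = f[np.argmax(Sxx, axis=0)]                       # strongest frequency per frame
print(peak_track[:5])
```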
  4. Kaland C, Baumann S
    J Acoust Soc Am, 2020 04;147(4):2974.
    PMID: 32359299 DOI: 10.1121/10.0001008
    Phrase-level prosody serves two essential functions in many languages of the world: chunking information into units (demarcating) and marking important information (highlighting). Recent work suggests that prosody has a mainly demarcating function in the Trade Malay language family. That is, the use of pitch accents in these languages is limited or absent, as the main prosodic events occur on the final two syllables in a phrase. The current study investigates the extent to which Papuan Malay phrase prosody is used for demarcating and highlighting, taking into account the potential influence of word stress. This is done by means of acoustic analyses on a corpus of spontaneous speech. Both the form (F0 movement) and the possible functions (demarcating and highlighting) of the final two syllables in Papuan Malay phrases are investigated. Although most results favor the demarcating function of Papuan Malay phrase prosody, a highlighting function cannot be ruled out. The results suggest that Papuan Malay might hold an exceptional position in the typology of prosodic prominence.
    Matched MeSH terms: Speech Acoustics
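The study above analyses F0 movement on phrase-final syllables. A frame-wise autocorrelation pitch estimator is one common way to obtain such an F0 contour; the sketch below is a simplified, hypothetical version and not the authors' analysis pipeline.

```python
# Hypothetical sketch of frame-wise autocorrelation F0 estimation; the study's
# own measurement procedure is not reproduced here.
import numpy as np

def estimate_f0(frame, fs, fmin=75.0, fmax=400.0):
    """Return an F0 estimate (Hz) for one speech frame via autocorrelation."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min, lag_max = int(fs / fmax), int(fs / fmin)
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return fs / lag

# usage: F0 contour over 40 ms frames of a (synthetic) stand-in signal
fs = 16000
speech = np.sin(2 * np.pi * 220 * np.arange(0, 1.0, 1 / fs))
frame_len = int(0.040 * fs)
contour = [estimate_f0(speech[i:i + frame_len], fs)
           for i in range(0, len(speech) - frame_len, frame_len // 2)]
print(contour[:5])   # ~220 Hz throughout for this synthetic example
```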
  5. Ting HN, Chia SY, Kim KS, Sim SL, Abdul Hamid B
    J Voice, 2011 Nov;25(6):e311-7.
    PMID: 21376529 DOI: 10.1016/j.jvoice.2010.05.004
    The acoustic properties of vowel phonation vary across cultures. These specific characteristics, including vowel fundamental frequency (F(0)) and perturbation measures (Absolute Jitter [Jita], Jitter [Jitt], Relative Average Perturbation [RAP], five-point Period Perturbation Quotient [PPQ5], Absolute Shimmer [ShdB], Shimmer [Shim], and 11-point Amplitude Perturbation Quotient [APQ11]), are not well established for Malaysian Chinese adults. This article investigates the F(0) and perturbation measurements of sustained vowels in 60 normal Malaysian Chinese adults using acoustical analysis. Malaysian Chinese females had significantly higher F(0) than Malaysian Chinese males in all six vowels. However, there were no significant differences in F(0) across the vowels for each gender. Significant differences between vowels were observed for Jita, Jitt, PPQ5, ShdB, Shim, and APQ11 among Chinese males, whereas significant differences between vowels were observed for all the perturbation parameters among Chinese females. Chinese males had significantly higher Jita and APQ11 in the vowels than Chinese females, whereas no significant differences were observed between males and females for Jitt, RAP, PPQ5, and Shim. Cross-ethnic comparisons indicate that F(0) of vowel phonation varies within the Chinese ethnic group and across other ethnic groups. The perturbation measures cannot be simply compared, as the measures may vary significantly across different speech analysis software packages.
    Matched MeSH terms: Speech Acoustics*
  6. Marina Kawi, Dayang Sariah Abang Suhai
    MyJurnal
    The study aims to identify the inventory of vowel phonemes of the Melanau Rajang dialect in Belawai, under the administration of Tanjung Manis District, Sarawak. This study is a field survey using interview methods to obtain data. A total of 250 items from the Swadesh list (Samarin, 1967) were used as a guide for data collection. Two informants of different genders, aged between 40 and 60 years old, were selected based on the informant selection criteria of Asmah Haji Omar (2001). The data were analysed qualitatively using a structural approach. The findings show that eight (8) vowel phonemes were identified: four (4) front vowels [i, e, ε, a]; one (1) central vowel [ə]; and three (3) back vowels [u, o, ɔ]. The distribution/alternation of all vowel phonemes of the Melanau Rajang dialect in Belawai is also discussed in this study. The findings also indicate that the active vowel phonemes are [a, i, u], while the inactive vowel phonemes are [ɔ, o, ε, ə, e].
    Matched MeSH terms: Speech Acoustics
  7. Ibrahim HM, Reilly S, Kilpatrick N
    Cleft Palate Craniofac J, 2012 Sep;49(5):e61-3.
    PMID: 21787239 DOI: 10.1597/11-001
    To establish normative nasalance scores for a set of newly developed stimuli in Malay.
    Matched MeSH terms: Speech Acoustics
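Nasalance scores of the kind referred to above are conventionally defined as the percentage of nasal acoustic energy relative to the combined nasal and oral energy. The sketch below illustrates that ratio on hypothetical two-channel data; instrument-specific filtering and segmentation are omitted.

```python
# Hedged sketch of the nasalance computation underlying nasometry scores;
# real instruments apply band-pass filtering and per-utterance averaging.
import numpy as np

def nasalance_percent(nasal, oral):
    """Nasal and oral channel signals recorded simultaneously (same length)."""
    nasal_rms = np.sqrt(np.mean(np.square(nasal)))
    oral_rms = np.sqrt(np.mean(np.square(oral)))
    return 100.0 * nasal_rms / (nasal_rms + oral_rms)

# usage with hypothetical two-channel nasometer data
rng = np.random.default_rng(1)
oral = rng.standard_normal(16000)
nasal = 0.3 * rng.standard_normal(16000)   # quieter nasal channel
print(nasalance_percent(nasal, oral))      # ~23% for this synthetic example
```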
  8. Kaland C, Gordon MK
    Phonetica, 2022 Jun 27;79(3):219-245.
    PMID: 35981718 DOI: 10.1515/phon-2022-2022
    The prosodic structure of under-researched languages in the Trade Malay language family is poorly understood. Although boundary marking has been uncontroversially shown as the major prosodic function in these languages, studies on the use of pitch accents to highlight important words in a phrase remain inconclusive. In addition, most knowledge of pitch accents is based on well-researched languages such as the ones from the Western-Germanic language family. This paper reports two word identification experiments comparing Papuan Malay with the pitch accent language American English, in order to investigate the extent to which the demarcating and highlighting function of prosody can be disentangled. To this end, target words were presented to native listeners of both languages and differed with respect to their position in the phrase (medial or final) and the shape of their f0 movement (original or manipulated). Reaction times for the target word identifications revealed overall faster responses for original and final words compared to manipulated and medial ones. The results add to previous findings on the facilitating effect of pitch accents and further improve our prosodic knowledge of under-researched languages.
    Matched MeSH terms: Speech Acoustics
  9. Mat Baki M, Wood G, Alston M, Ratcliffe P, Sandhu G, Rubin JS, et al.
    Clin Otolaryngol, 2015 Feb;40(1):22-8.
    PMID: 25263076 DOI: 10.1111/coa.12313
    OBJECTIVE: To evaluate the agreement between OperaVOX and the Multi-Dimensional Voice Program (MDVP).

    DESIGN: Cross sectional reliability study.

    SETTING: University teaching hospital.

    METHODS: Fifty healthy volunteers and 50 voice disorder patients had supervised recordings in a quiet room using OperaVOX via the iPod's internal microphone at a sampling rate of 45 kHz. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer and noise-to-harmonic ratio (NHR). All healthy volunteers and 21 patients had a second recording. The recorded voices were also analysed using MDVP. The inter- and intrasoftware reliability was analysed using the intraclass correlation (ICC) test and the Bland-Altman (BA) method. The Mann-Whitney test was used to compare the acoustic parameters between healthy volunteers and patients.

    RESULTS: Nine of 50 patients had severely aperiodic voices. The ICC was high, with a confidence interval of >0.75, for the inter- and intrasoftware reliability except for the NHR. For the intersoftware BA analysis, excluding the severely aperiodic voice data sets, the bias (95% LOA) of F0, jitter, shimmer and NHR was 0.81 (11.32, -9.71); -0.13 (1.26, -1.52); -0.52 (1.68, -2.72); and 0.08 (0.27, -0.10), respectively. For the intrasoftware reliability, it was -1.48 (18.43, -21.39); 0.05 (1.31, -1.21); -0.01 (2.87, -2.89); and 0.005 (0.20, -0.18), respectively. Normative data from the healthy volunteers were obtained. There was a significant difference in all acoustic parameters between volunteers and patients measured by OperaVOX (P

    Matched MeSH terms: Speech Acoustics*
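The bias and 95% limits of agreement quoted in the RESULTS above come from a Bland-Altman analysis of paired measurements. The following sketch shows that computation on hypothetical paired jitter values; it is not the study's data or code.

```python
# Minimal sketch of Bland-Altman statistics (bias and 95% limits of agreement)
# for paired measurements of the same recordings by two systems.
import numpy as np

def bland_altman(a, b):
    """Return bias and (upper, lower) 95% limits of agreement for paired data."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias + 1.96 * sd, bias - 1.96 * sd)

# usage: jitter (%) measured by two software packages on the same 20 recordings (synthetic)
rng = np.random.default_rng(2)
opera = 1.0 + 0.4 * rng.standard_normal(20)
mdvp = opera - 0.13 + 0.7 * rng.standard_normal(20)   # small systematic offset
bias, (upper, lower) = bland_altman(opera, mdvp)
print(f"bias {bias:.2f}, LOA ({upper:.2f}, {lower:.2f})")
```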
  10. Röper KM, Scheumann M, Wiechert AB, Nathan S, Goossens B, Owren MJ, et al.
    Am J Primatol, 2014 Feb;76(2):192-201.
    PMID: 24123122 DOI: 10.1002/ajp.22221
    The endangered proboscis monkey (Nasalis larvatus) is a sexually highly dimorphic Old World primate endemic to the island of Borneo. Previous studies focused mainly on its ecology and behavior, but knowledge of its vocalizations is limited. The present study provides quantified information on vocal rate and on the vocal acoustics of the prominent calls of this species. We audio-recorded vocal behavior of 10 groups over two 4-month periods at the Lower Kinabatangan Wildlife Sanctuary in Sabah, Borneo. We observed monkeys and recorded calls in evening and morning sessions at sleeping trees along riverbanks. We found no differences in the vocal rate between evening and morning observation sessions. Based on multiparametric analysis, we identified acoustic features of the four common call-types "shrieks," "honks," "roars," and "brays." "Chorus" events were also noted in which multiple callers produced a mix of vocalizations. The four call-types were distinguishable based on a combination of fundamental frequency variation, call duration, and degree of voicing. Three of the call-types can be considered as "loud calls" and are therefore deemed promising candidates for non-invasive, vocalization-based monitoring of proboscis monkeys for conservation purposes.
    Matched MeSH terms: Speech Acoustics*
  11. Ting HN, Zourmand A, Chia SY, Yong BF, Abdul Hamid B
    J Voice, 2012 Sep;26(5):664.e1-6.
    PMID: 22285457 DOI: 10.1016/j.jvoice.2011.08.008
    The formant frequencies of Malaysian Malay children have not been well studied. This article investigates the first four formant frequencies of sustained vowels in 360 Malay children aged between 7 and 12 years using acoustical analysis. Generally, Malay female children had higher formant frequencies than those of their male counterparts. However, no significant differences in all four formant frequencies were observed between the Malay male and female children in most of the vowels and age groups. Significant differences in all formant frequencies were found across the Malay vowels in both Malay male and female children for all age groups except for F4 in female children aged 12 years. Generally, the Malaysian Malay children showed a nonsystematic decrement in formant frequencies with age. Low levels of significant differences in formant frequencies were observed across the age groups in most of the vowels for F1, F3, and F4 in Malay male children and F1 and F4 in Malay female children.
    Matched MeSH terms: Speech Acoustics*
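One common way to estimate formant frequencies from a sustained vowel is LPC root-solving. The sketch below is a hedged illustration of that approach using librosa; the file name, model order and pre-emphasis step are assumptions, not details taken from the study.

```python
# Hedged sketch of LPC-based formant estimation; not the study's own method.
import numpy as np
import librosa

def formants(frame, fs, order=12, n_formants=4):
    """Estimate formant frequencies (Hz) of one vowel frame via LPC roots."""
    a = librosa.lpc(frame.astype(float), order=order)   # all-pole model coefficients
    roots = [r for r in np.roots(a) if np.imag(r) > 0]  # keep one of each conjugate pair
    freqs = sorted(np.angle(roots) * fs / (2 * np.pi))
    freqs = [f for f in freqs if f > 90]                 # drop near-DC roots
    return freqs[:n_formants]

# usage on a hypothetical recording of a child's sustained vowel
y, fs = librosa.load("vowel_a_child.wav", sr=16000)      # path is illustrative only
y = librosa.effects.preemphasis(y)                       # boost higher formants
mid = len(y) // 2
print(formants(y[mid:mid + int(0.03 * fs)], fs))          # e.g. [F1, F2, F3, F4]
```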
  12. Zourmand A, Ting HN, Mirhassani SM
    J Voice, 2013 Mar;27(2):201-9.
    PMID: 23473455 DOI: 10.1016/j.jvoice.2012.12.006
    Speech is one of the most prevalent communication media for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained an optimal classification accuracy of 84.17% across all age groups. The accuracy was further increased to 99.81% using a multilayer perceptron based on mel-frequency cepstral coefficients.
    Matched MeSH terms: Speech Acoustics*
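The Euclidean minimum-distance classifier described above is equivalent to a nearest-centroid rule on (normalised) frequency features. The sketch below illustrates it on synthetic F0/formant vectors; the feature values, normalisation and train/test split are hypothetical.

```python
# Minimal nearest-centroid (Euclidean minimum-distance) gender classifier on
# synthetic, normalised (F0, F1, F2, F3) features; values are illustrative only.
import numpy as np

def fit_centroids(X, y):
    """Per-class mean vectors from training features X (n, d) and labels y."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(X, centroids):
    labels = list(centroids)
    dists = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in labels], axis=1)
    return np.array(labels)[np.argmin(dists, axis=1)]

# synthetic feature vectors for two groups of children
rng = np.random.default_rng(3)
boys = rng.normal([230, 750, 1400, 2600], 40, size=(100, 4))
girls = rng.normal([245, 800, 1500, 2700], 40, size=(100, 4))
X = np.vstack([boys, girls])
y = np.array(["M"] * 100 + ["F"] * 100)
mu, sigma = X.mean(axis=0), X.std(axis=0)
Xn = (X - mu) / sigma                        # frequency normalisation
cents = fit_centroids(Xn[::2], y[::2])       # train on every other sample
print((predict(Xn[1::2], cents) == y[1::2]).mean())   # held-out accuracy
```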
  13. Fraundorf SH, Watson DG, Benjamin AS
    Psychol Aging, 2012 Mar;27(1):88-98.
    PMID: 21639646 DOI: 10.1037/a0024138
    In two experiments, we investigated age-related changes in how prosodic pitch accents affect memory. Participants listened to recorded discourses that contained two contrasts between pairs of items (e.g., one story contrasted British scientists with French scientists and Malaysia with Indonesia). The end of each discourse referred to one item from each pair; these references received a pitch accent that either denoted contrast (L + H* in the ToBI system) or did not (H*). A contrastive accent on a particular pair improved later recognition memory equally for young and older adults. However, older adults showed decreased memory if the other pair received a contrastive accent (Experiment 1). Young adults with low working memory performance also showed this penalty (Experiment 2). These results suggest that pitch accents guide processing resources to important information for both older and younger adults but diminish memory for less important information in groups with reduced resources, including older adults.
    Matched MeSH terms: Speech Acoustics*
  14. Mustafa MB, Salim SS, Mohamed N, Al-Qatab B, Siong CE
    PLoS One, 2014;9(1):e86285.
    PMID: 24466004 DOI: 10.1371/journal.pone.0086285
    Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution for building a speech acoustic model of impaired speech is to employ adaptation techniques. However, two issues have not been addressed in existing studies on adaptation for speech impairment: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques, namely maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality impaired-speech data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. Phoneme substitution was found to be the biggest contributor to WER in dysarthric speech at all levels of severity. The results show that speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
    Matched MeSH terms: Speech Acoustics
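Word error rate, the evaluation metric used above, is the edit distance between reference and hypothesis transcripts (substitutions, deletions, insertions) divided by the number of reference words. A minimal sketch with an invented transcript pair:

```python
# Standard WER via dynamic-programming edit distance; the transcripts are hypothetical.
def wer(reference, hypothesis):
    """WER = (S + D + I) / N, with N the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                        # all deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                        # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match / substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("open the front door now", "open a front door"))   # 0.4 (1 substitution + 1 deletion)
```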
  15. Phoon HS, Abdullah AC, Maclagan M
    Int J Speech Lang Pathol, 2012 Dec;14(6):487-98.
    PMID: 23039125 DOI: 10.3109/17549507.2012.719549
    This study investigates the effect of dialect on phonological analyses in Chinese-influenced Malaysian English (ChME) speaking children. A total of 264 typically-developing ChME speaking children aged 3-7 years participated in this cross-sectional study. A single word naming task consisting of 195 words was used to elicit speech from the children. The samples obtained were transcribed phonetically and analysed descriptively and statistically. Phonological analyses were completed for speech sound accuracy, age of consonant acquisition, percentage of phonological process occurrence, and age of suppression for phonological processes. All these measurements differed based on whether or not ChME dialectal features were considered correct, with children gaining higher scores when ChME dialect features were considered correct. The findings of the present study provide guidelines for Malaysian speech-language pathologists and stress the need to appropriately consider ChME dialectal features in the phonological analysis of ChME speaking children. They also highlight the issues in accurate differential diagnosis of speech impairment for speech-language pathologists working with children from any linguistically diverse background.
    Matched MeSH terms: Speech Acoustics*
  16. Mustafa MB, Ainon RN
    J Acoust Soc Am, 2013 Oct;134(4):3057-66.
    PMID: 24116440 DOI: 10.1121/1.4818741
    The ability of a speech synthesis system to synthesize emotional speech enhances the user's experience when using such a system and its related applications. However, the development of an emotional speech synthesis system is a daunting task in view of the complexity of human emotional speech. The more recent state-of-the-art speech synthesis systems, such as those based on hidden Markov models, can synthesize emotional speech with acceptable naturalness given a good emotional speech acoustic model. However, building an emotional speech acoustic model requires adequate resources, including segment-phonetic labels of emotional speech, which is a problem for many under-resourced languages, including Malay. This research shows how it is possible to build an emotional speech acoustic model for Malay with minimal resources. To achieve this objective, two forms of initialization were considered: iterative training using the deterministic annealing expectation maximization algorithm and isolated unit training. The seed model for the automatic segmentation is a neutral speech acoustic model, which was transformed to the target emotion using two transformation techniques: model adaptation and context-dependent boundary refinement. Two forms of evaluation were performed: an objective evaluation measuring the prosody error and a listening evaluation measuring the naturalness of the synthesized emotional speech.
    Matched MeSH terms: Speech Acoustics*
  17. Ali Z, Alsulaiman M, Muhammad G, Elamvazuthi I, Al-Nasheri A, Mesallam TA, et al.
    J Voice, 2017 May;31(3):386.e1-386.e8.
    PMID: 27745756 DOI: 10.1016/j.jvoice.2016.09.009
    A large population around the world has voice complications. Various approaches for subjective and objective evaluation have been suggested in the literature. The subjective approach strongly depends on the experience and area of expertise of a clinician, and human error cannot be neglected. On the other hand, the objective or automatic approach is noninvasive. Automatically developed systems can provide complementary information that may be helpful for a clinician in the early screening of a voice disorder. At the same time, automatic systems can be deployed in remote areas where a general practitioner can use them and may refer the patient to a specialist to avoid complications that may be life threatening. Many automatic systems for disorder detection have been developed by applying different types of conventional speech features such as linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCCs). This study aims to ascertain whether conventional speech features detect voice pathology reliably, and whether they can be correlated with voice quality. To investigate this, an automatic detection system based on MFCCs was developed, and three different voice disorder databases were used in this study. The experimental results suggest that the accuracy of the MFCC-based system varies from database to database. The detection rate for the intra-database experiments ranges from 72% to 95%, and that for the inter-database experiments from 47% to 82%. The results indicate that conventional speech features are not correlated with voice quality, and hence are not reliable for pathology detection.
    Matched MeSH terms: Speech Acoustics*
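A minimal MFCC-based detection pipeline in the spirit of the system above could look like the sketch below; the file names, the per-file averaging of MFCCs and the SVM back end are illustrative assumptions rather than the paper's configuration.

```python
# Hypothetical MFCC-based voice pathology detector sketch (not the paper's system):
# librosa for features, scikit-learn SVM for the decision.
import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Mean MFCC vector over all frames of one recording."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape (n_mfcc, frames)
    return mfcc.mean(axis=1)

# hypothetical file lists: sustained /a/ samples labelled normal vs pathological
normal_files = ["normal_01.wav", "normal_02.wav"]
pathological_files = ["path_01.wav", "path_02.wav"]
X = np.array([mfcc_features(f) for f in normal_files + pathological_files])
y = np.array([0] * len(normal_files) + [1] * len(pathological_files))

clf = SVC(kernel="rbf").fit(X, y)          # detector: 0 = normal, 1 = disordered
print(clf.predict(X))                      # training-set predictions (toy example)
```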
  18. Ooi CC, Wong AM
    Int J Speech Lang Pathol, 2012 Dec;14(6):499-508.
    PMID: 23039126 DOI: 10.3109/17549507.2012.712159
    One reason why specific language impairment (SLI) is grossly under-identified in Malaysia is the absence of locally-developed norm-referenced language assessment tools for its multilingual and multicultural population. Spontaneous language samples provide quantitative information for language assessment, and useful descriptive information on child language development in complex language and cultural environments. This research consisted of two studies and investigated the use of measures obtained from English conversational samples among bilingual Chinese-English Malaysian preschoolers. The research found that the language sample measures were sensitive to developmental changes in this population and could identify SLI. The first study examined the relationship between age and mean length of utterance (MLU(w)), lexical diversity (D), and the Index of Productive Syntax (IPSyn) among 52 typically-developing (TD) children aged between 3;4 and 6;9. Analyses showed a significant linear relationship between age and D (r = .450), the IPSyn (r = .441), and MLU(w) (r = .318). The second study compared the same measures obtained from 10 children with SLI, aged between 3;8 and 5;11, and their age-matched controls. The children with SLI had significantly shorter MLU(w) and lower IPSyn scores than the TD children. These findings suggest that utterance length and syntax production can be potential clinical markers of SLI in Chinese-English Malaysian children.
    Matched MeSH terms: Speech Acoustics*
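Mean length of utterance in words, MLU(w), one of the language sample measures above, is simply the average word count per utterance. A small sketch on an invented transcript fragment:

```python
# MLU(w) on a pre-segmented, hypothetical conversational sample.
def mlu_w(utterances):
    """Mean number of words per utterance in a transcribed language sample."""
    counts = [len(u.split()) for u in utterances if u.strip()]
    return sum(counts) / len(counts)

sample = ["I want that one", "no", "mummy give me the red car please"]
print(mlu_w(sample))   # (4 + 1 + 7) / 3 = 4.0
```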
  19. Ali Z, Elamvazuthi I, Alsulaiman M, Muhammad G
    J Voice, 2016 Nov;30(6):757.e7-757.e19.
    PMID: 26522263 DOI: 10.1016/j.jvoice.2015.08.010
    BACKGROUND AND OBJECTIVE: Automatic voice pathology detection using sustained vowels has been widely explored. Because of the stationary nature of the speech waveform, pathology detection with a sustained vowel is a comparatively easier task than with running speech. Some disorder detection systems with running speech have also been developed, although most of them are based on voice activity detection (VAD), which is itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module.

    METHOD: A set of three psychophysical conditions of hearing (critical band spectral estimation, the equal loudness hearing curve, and the intensity-loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrum are computed, analyzed, and used in a Gaussian mixture model for automatic decision-making.

    RESULTS: In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech-based systems.

    DISCUSSION: The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.

    Matched MeSH terms: Speech Acoustics*
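The Gaussian mixture model decision stage mentioned in the METHOD above is typically implemented by fitting one GMM per class and comparing log-likelihoods at test time. The sketch below shows that pattern with scikit-learn on placeholder features; the paper's auditory-spectrum front end is not reproduced.

```python
# Generic GMM-based detection sketch (fit one model per class, pick the class
# with the higher total log-likelihood); features here are synthetic placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
# hypothetical frame-level feature matrices (frames x feature dimension)
normal_frames = rng.normal(0.0, 1.0, size=(500, 12))
pathological_frames = rng.normal(0.6, 1.3, size=(500, 12))

gmm_normal = GaussianMixture(n_components=8, covariance_type="diag",
                             random_state=0).fit(normal_frames)
gmm_path = GaussianMixture(n_components=8, covariance_type="diag",
                           random_state=0).fit(pathological_frames)

def classify(frames):
    """Return 'pathological' if its GMM explains the frames better."""
    ll_normal = gmm_normal.score_samples(frames).sum()
    ll_path = gmm_path.score_samples(frames).sum()
    return "pathological" if ll_path > ll_normal else "normal"

test = rng.normal(0.6, 1.3, size=(80, 12))   # frames from a hypothetical test file
print(classify(test))
```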
  20. Chu SY, Barlow SM, Lee J, Wang J
    Int J Speech Lang Pathol, 2017 12;19(6):616-627.
    PMID: 28425760 DOI: 10.1080/17549507.2016.1265587
    PURPOSE: This research characterised perioral muscle reciprocity and the amplitude ratio in the lower lip during bilabial syllable production [pa] at three rates, to understand the neuromotor dynamics and scaling of motor speech patterns in individuals with Parkinson's disease (PD).

    METHOD: Electromyographic (EMG) signals of the orbicularis oris superior [OOS], orbicularis oris inferior [OOI] and depressor labii inferioris [DLI] were recorded during syllable production and expressed as polar-phase notations.

    RESULT: PD participants exhibited the general features of reciprocity between OOS, OOI and DLI muscles as reflected in the EMG during syllable production. The control group showed significantly higher integrated EMG amplitude ratio in the DLI:OOS muscle pairs than PD participants. No speech rate effects were found in EMG muscle reciprocity and amplitude magnitude across all muscle pairs.

    CONCLUSION: Similar patterns of muscle reciprocity in PD and controls suggest that corticomotoneuronal output to the facial nucleus and the respective perioral muscles is relatively well preserved in our cohort of mild idiopathic PD participants. The reduction of the EMG amplitude ratio among PD participants is consistent with the putative reduction in thalamocortical activation characteristic of this disease, which limits the motor cortex from generating appropriate commands and contributes to bradykinesia and hypokinesia of the orofacial mechanism.

    Matched MeSH terms: Speech Acoustics*
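The integrated EMG amplitude ratio reported above (e.g. DLI:OOS) can be sketched as the ratio of rectified, summed EMG amplitude between two muscles over one syllable; the example below uses synthetic traces and omits the filtering and burst segmentation a real analysis would require.

```python
# Simplified, hypothetical sketch of an integrated-EMG amplitude ratio.
import numpy as np

def integrated_emg(signal):
    """Full-wave rectify the EMG and integrate (sum of absolute amplitude)."""
    sig = np.asarray(signal, float)
    return np.sum(np.abs(sig - sig.mean()))

def amplitude_ratio(muscle_a, muscle_b):
    """Ratio of integrated EMG amplitude between two muscles over one syllable."""
    return integrated_emg(muscle_a) / integrated_emg(muscle_b)

# usage with synthetic EMG traces for one [pa] syllable
rng = np.random.default_rng(5)
dli = 0.8 * rng.standard_normal(2000)    # depressor labii inferioris
oos = 0.5 * rng.standard_normal(2000)    # orbicularis oris superior
print(amplitude_ratio(dli, oos))          # ~1.6 for this synthetic example
```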