Displaying publications 1 - 20 of 34 in total

  1. Kalashnikova M, Singh L, Tsui A, Altuntas E, Burnham D, Cannistraci R, et al.
    Dev Sci, 2024 May;27(3):e13459.
    PMID: 37987377 DOI: 10.1111/desc.13459
    We report the findings of a multi-language and multi-lab investigation of young infants' ability to discriminate lexical tones as a function of their native language, age and language experience, as well as of tone properties. Given the high prevalence of lexical tones across human languages, understanding lexical tone acquisition is fundamental for comprehensive theories of language learning. While there are some similarities between the developmental course of lexical tone perception and that of vowels and consonants, findings for lexical tones tend to vary greatly across different laboratories. To reconcile these differences and to assess the developmental trajectory of native and non-native perception of tone contrasts, this study employed a single experimental paradigm with the same two pairs of Cantonese tone contrasts (perceptually similar vs. distinct) across 13 laboratories in Asia-Pacific, Europe and North-America testing 5-, 10- and 17-month-old monolingual (tone, pitch-accent, non-tone) and bilingual (tone/non-tone, non-tone/non-tone) infants. Across the age range and language backgrounds, infants who were not exposed to Cantonese showed robust discrimination of the two non-native lexical tone contrasts. Contrary to this overall finding, the statistical model assessing native discrimination by Cantonese-learning infants failed to yield significant effects. These findings indicate that lexical tone sensitivity is maintained from 5 to 17 months in infants acquiring tone and non-tone languages, challenging the generalisability of the existing theoretical accounts of perceptual narrowing in the first months of life. RESEARCH HIGHLIGHTS: This is a multi-language and multi-lab investigation of young infants' ability to discriminate lexical tones. This study included data from 13 laboratories testing 5-, 10-, and 17-month-old monolingual (tone, pitch-accent, non-tone) and bilingual (tone/non-tone, non-tone/non-tone) infants. 
    Overall, infants discriminated a perceptually similar and a distinct non-native tone contrast, although there was no evidence of a native tone-language advantage in discrimination. These results demonstrate maintenance of tone discrimination throughout development.
    Matched MeSH terms: Phonetics
  2. Lim A, O'Brien B, Onnis L
    Behav Res Methods, 2024 Mar;56(3):1283-1313.
    PMID: 37553536 DOI: 10.3758/s13428-023-02094-5
    Research on orthographic consistency in English words has selectively identified different sub-syllabic units in isolation (grapheme, onset, vowel, coda, rime), yet there is no comprehensive assessment of how these measures affect word identification when taken together. To study which aspects of consistency are more psychologically relevant, we investigated their independent and composite effects on human reading behavior using large-scale databases. Study 1 found effects on adults' naming responses of both feedforward consistency (orthography to phonology) and feedback consistency (phonology to orthography). Study 2 found feedback but no feedforward consistency effects on visual and auditory lexical decision tasks, with the best predictor being a composite measure of consistency across grapheme, rime, OVC, and word-initial letter-phoneme. In Study 3, we explicitly modeled the reading process with forward and backward flow in a bidirectionally connected neural network. The model captured latent dimensions of quasi-regular mapping that explain additional variance in human reading and spelling behavior, compared to the established measures. Together, the results suggest interactive activation between phonological and orthographic word representations. They also validate the role of computational analyses of language to better understand how print maps to sound, and what properties of natural language affect reading complexity.
    Matched MeSH terms: Phonetics*
  3. Anis FN, Umat C, Ahmad K, Abdul Hamid B
    Cochlear Implants Int, 2022 Nov;23(6):347-357.
    PMID: 36005236 DOI: 10.1080/14670100.2022.2114583
    OBJECTIVE: This study aimed to compare the error patterns of Arabic phoneme-grapheme correspondence by a group of Malay children with cochlear implants (CIs) and normal hearing (NH) and the effects of the visual graphical features of Arabic graphemes (no-dot, single-dot, and multiple-dots) on the phoneme-grapheme correspondence.

    METHODS: Participants were matched for hearing age (Mean, M = 7 ± 1.03 years) and duration of exposure to Arabic sounds (M = 2.7 ± 1.2 years). All 28 Arabic phonemes were presented through a loudspeaker and participants pointed to the graphemes associated with the presented phonemes.

    RESULTS: A total of 336 and 616 tokens were collected for six children with CI and 11 NH children for each task, i.e., phonemes repetition and phoneme-grapheme correspondence. Both groups found it easier to repeat phonemes than the phoneme-grapheme correspondence. The children with CIs showed more confusion ([ظ, ز, ذ, ض, خ, ب, ه, س, ع, & ث] >10% correct scores) in phoneme-grapheme correspondence than the NH children ([ظ:14%] and [ث: 27%]). There was a significant interaction (p = 0.001) among the three visual graphical features and hearing status (CI and NH).

    CONCLUSION: Our results infer that non-native Malay children with CIs and NH use different strategies to process the Arabic graphemes' visual features for phoneme-grapheme correspondence.

    Matched MeSH terms: Phonetics
  4. Kaland C, Gordon MK
    Phonetica, 2022 Jun 27;79(3):219-245.
    PMID: 35981718 DOI: 10.1515/phon-2022-2022
    The prosodic structure of under-researched languages in the Trade Malay language family is poorly understood. Although boundary marking has been uncontroversially shown as the major prosodic function in these languages, studies on the use of pitch accents to highlight important words in a phrase remain inconclusive. In addition, most knowledge of pitch accents is based on well-researched languages such as the ones from the Western-Germanic language family. This paper reports two word identification experiments comparing Papuan Malay with the pitch accent language American English, in order to investigate the extent to which the demarcating and highlighting function of prosody can be disentangled. To this end, target words were presented to native listeners of both languages and differed with respect to their position in the phrase (medial or final) and the shape of their f0 movement (original or manipulated). Reaction times for the target word identifications revealed overall faster responses for original and final words compared to manipulated and medial ones. The results add to previous findings on the facilitating effect of pitch accents and further improve our prosodic knowledge of underresearched languages.
    Matched MeSH terms: Phonetics
  5. Kaland C, Kluge A, van Heuven VJ
    Phonetica, 2021 04 27;78(2):141-168.
    PMID: 33892529 DOI: 10.1515/phon-2021-2003
    The existence of word stress in Indonesian languages has been controversial. Recent acoustic analyses of Papuan Malay suggest that this language has word stress, counter to other studies and unlike closely related languages. The current study further investigates Papuan Malay by means of lexical (non-acoustic) analyses of two different aspects of word stress. In particular, this paper reports two distribution analyses of a word corpus, 1) investigating the extent to which stress patterns may help word recognition and 2) exploring the phonological factors that predict the distribution of stress patterns. The facilitating role of stress patterns in word recognition was investigated in a lexical analysis of word embeddings. The results show that Papuan Malay word stress (potentially) helps to disambiguate words. As for stress predictors, a random forest analysis investigated the effect of multiple morpho-phonological factors on stress placement. It was found that the mid vowels /ɛ/ and /ɔ/ play a central role in stress placement, refining the conclusions of previous work that mainly focused on /ɛ/. The current study confirms that non-acoustic research on stress can complement acoustic research in important ways. Crucially, the combined findings on stress in Papuan Malay so far give rise to an integrated perspective to word stress, in which phonetic, phonological and cognitive factors are considered.
    Matched MeSH terms: Phonetics*
  6. Mohd Ibrahim H, Lim HW, Ahmad Rusli Y, Lim CT
    Clin Linguist Phon, 2020 06 02;34(6):554-565.
    PMID: 31537131 DOI: 10.1080/02699206.2019.1668480
    This study was designed to develop language-specific stimuli for the assessment of resonance and to obtain nasalance scores using the newly developed speech stimuli in Mandarin. Gender and age influences on nasalance scores for each of the stimulus were also examined. Participants recruited were typically developing Mandarin-speaking ethnic Chinese children aged 6;00-7;11 growing up in Malaysia. Perceptual ratings of nasality were made based on the GOS.SP.ASS.'98 (revised) for children while nasalance scores were recorded for each stimulus using the Nasometer II (Model 6400). Fifty Mandarin-speaking children (24 males and 26 females) were recruited. None of the participants were perceived with abnormal nasality on the three stimuli. The mean nasalance scores for the Mandarin stimuli were 16.08% (SD = 2.57, 95% CI = 15.35-16.81) for the Oral passage, 25.20% (SD = 3.63, 95% CI = 24.17-26.23) for the Oral-Nasal passage and 55.44% (SD = 4.17, 95% CI = 54.25-56.63) for the Nasal passage. No significant age- and gender-related differences were observed for all the three stimuli. This is the first set of Mandarin stimuli and nasalance norms for Mandarin-speaking children in Malaysia. The influence of phonetic content on nasalance is supported. Findings call for language-specific normative nasalance data and careful selection of stimuli for the assessment of resonance.
    Matched MeSH terms: Phonetics*
  7. Turagam N, Mudrakola DP, Yelamanchi RS, Deepthi M, Natarajan M
    J Int Soc Prev Community Dent, 2019 02 14;9(1):94-98.
    PMID: 30923701 DOI: 10.4103/jispcd.JISPCD_220_18
    Denture esthetics, as defined by the Glossary of Prosthodontic Terms, is the effect produced by a dental prosthesis that affects the beauty and attractiveness of the person. [1] Removable partial dentures (RPDs) are widely accepted and are the treatment of choice for most cases because they are both effective and affordable. Partially edentulous treatment planning includes both esthetics and masticatory function. A prosthesis that is highly esthetic will improve the patient's motivation and acceptance. It is wrong to expect that patients will tolerate unesthetic partial dentures because good masticatory capability has been achieved. Esthetics plays a vital role in the success of partial dentures, and the length and mobility of the patient's lips play a significant role in achieving it. [2] Patients with short lips or highly mobile lips pose problems as esthetics are compromised because most clasp arms, denture borders, and other components will show when the patient smiles or speaks. [3] RPDs can easily look artificial; hence, special emphasis should be placed on restoring function, phonetics, and esthetics with long-term benefits, which requires meticulous attention during fabrication. This case report describes an esthetic clasp designed for a cast partial denture for a young girl, for esthetics and function.
    Matched MeSH terms: Phonetics
  8. Anis FN, Umat C, Ahmad K, Hamid BA
    Cochlear Implants Int, 2019 01;20(1):12-22.
    PMID: 30293522 DOI: 10.1080/14670100.2018.1530420
    OBJECTIVE: This study examined the patterns of recognition of Arabic consonants, via information transmission analysis for phonological features, in a group of Malay children with normal hearing (NH) and cochlear implants (CI).

    METHOD: A total of 336 and 616 acoustic tokens were collected from six CI and 11 NH Malay children, respectively. The groups were matched for hearing age and duration of exposure to Arabic sounds. All the 28 Arabic consonants in the form of consonant-vowel /a/ were presented randomly twice via a loudspeaker at approximately 65 dB SPL. The participants were asked to repeat verbally the stimulus heard in each presentation.

    RESULTS: Within the native Malay perceptual space, the two groups responded differently to the Arabic consonants. The dispersed uncategorized assimilation in the CI group was distinct in the confusion matrix (CM), as compared to the NH children. Consonants /ħ/, /tˁ/, /sˁ/ and /ʁ/ were difficult for the CI children, while the most accurate item was /k/ (84%). The CI group transmitted significantly reduced information, especially for place feature transmission, than the NH group (p

    Matched MeSH terms: Phonetics
  9. Chong FY, Jenstad LM
    Med J Malaysia, 2018 12;73(6):365-370.
    PMID: 30647205
    INTRODUCTION: Modulation-based noise reduction (MBNR) is one of the common noise reduction methods used in hearing aids. Gain reduction in high frequency bands may occur for some implementations of MBNR and fricatives might be susceptible to alteration, given the high frequency components in fricative noise. The main objective of this study is to quantify the acoustic effect of MBNR on /s, z/.

    METHODS: Speech-and-noise signals were presented to, and recorded from, six hearing aids mounted on a head and torso simulator. Test stimuli were nonsense words mixed with pink, cafeteria, or speech-modulated noise at 0 dB SNR. Fricatives /s, z/ were extracted from the recordings for analysis.

    RESULTS: Analysis of the noise confirmed that MBNR in all hearing aids was activated for the recordings. More than 1.0 dB of acoustic change occurred to /s, z/ when MBNR was turned on in four out of the six hearing aids in the pink and cafeteria noise conditions. The acoustics of /s, z/ by female talkers were affected more than male talkers. Significant relationships between amount of noise reduction and acoustic change of /s, z/ were found. Amount of noise reduction accounts for 42.8% and 16.8% of the variability in acoustic change for /s/ and /z/ respectively.

    CONCLUSION: Some clinically-available implementations of MBNR have measurable effects on the acoustics of fricatives. Possible implications for speech perception are discussed.

    Matched MeSH terms: Phonetics
  10. Majid A, Roberts SG, Cilissen L, Emmorey K, Nicodemus B, O'Grady L, et al.
    Proc Natl Acad Sci U S A, 2018 Nov 06;115(45):11369-11376.
    PMID: 30397135 DOI: 10.1073/pnas.1720419115
    Is there a universal hierarchy of the senses, such that some senses (e.g., vision) are more accessible to consciousness and linguistic description than others (e.g., smell)? The long-standing presumption in Western thought has been that vision and audition are more objective than the other senses, serving as the basis of knowledge and understanding, whereas touch, taste, and smell are crude and of little value. This predicts that humans ought to be better at communicating about sight and hearing than the other senses, and decades of work based on English and related languages certainly suggests this is true. However, how well does this reflect the diversity of languages and communities worldwide? To test whether there is a universal hierarchy of the senses, stimuli from the five basic senses were used to elicit descriptions in 20 diverse languages, including 3 unrelated sign languages. We found that languages differ fundamentally in which sensory domains they linguistically code systematically, and how they do so. The tendency for better coding in some domains can be explained in part by cultural preoccupations. Although languages seem free to elaborate specific sensory domains, some general tendencies emerge: for example, with some exceptions, smell is poorly coded. The surprise is that, despite the gradual phylogenetic accumulation of the senses, and the imbalances in the neural tissue dedicated to them, no single hierarchy of the senses imposes itself upon language.
    Matched MeSH terms: Phonetics
  11. Valentini A, Ricketts J, Pye RE, Houston-Price C
    J Exp Child Psychol, 2018 03;167:10-31.
    PMID: 29154028 DOI: 10.1016/j.jecp.2017.09.022
    Reading and listening to stories fosters vocabulary development. Studies of single word learning suggest that new words are more likely to be learned when both their oral and written forms are provided, compared with when only one form is given. This study explored children's learning of phonological, orthographic, and semantic information about words encountered in a story context. A total of 71 children (8- and 9-year-olds) were exposed to a story containing novel words in one of three conditions: (a) listening, (b) reading, or (c) simultaneous listening and reading ("combined" condition). Half of the novel words were presented with a definition, and half were presented without a definition. Both phonological and orthographic learning were assessed through recognition tasks. Semantic learning was measured using three tasks assessing recognition of each word's category, subcategory, and definition. Phonological learning was observed in all conditions, showing that phonological recoding supported the acquisition of phonological forms when children were not exposed to phonology (the reading condition). In contrast, children showed orthographic learning of the novel words only when they were exposed to orthographic forms, indicating that exposure to phonological forms alone did not prompt the establishment of orthographic representations. Semantic learning was greater in the combined condition than in the listening and reading conditions. The presence of the definition was associated with better performance on the semantic subcategory and definition posttests but not on the phonological, orthographic, or category posttests. Findings are discussed in relation to the lexical quality hypothesis and the availability of attentional resources.
    Matched MeSH terms: Phonetics
  12. MARINA KAWI, DAYANG SARIAH ABANG SUHAI
    MyJurnal
    The study aims to identify the inventory of vowel phonemes of the Melanau Rajang dialect in Belawai, under the administration of Tanjung Manis District, Sarawak. This study is a field survey using interview methods to obtain data. A total of 250 Swadesh list items (Samarin, 1967) were used as a guide for data collection. In this study, two informants of different genders, aged between 40 and 60 years, were selected based on the informant selection criteria of Asmah Haji Omar (2001). The data were analysed qualitatively using a structural approach. The findings show that eight (8) vowel phonemes were identified: four (4) front vowels [i, e, ε, a]; one (1) central vowel [ə]; and three (3) back vowels [u, o, ɔ]. In addition, the distribution/alternation of all vowel phonemes of the Melanau Rajang dialect in Belawai is discussed in this study. The findings also indicate that the active vowel phonemes are [a, i, u], while the inactive vowel phonemes are [ɔ, o, ε, ə, e].
    Matched MeSH terms: Phonetics
  13. Lim HW
    Clin Linguist Phon, 2018;32(10):889-912.
    PMID: 29993293 DOI: 10.1080/02699206.2018.1459852
    Child multilingual phonological errors are under-explored. Cross-linguistic studies suggest monolingual children make phonological errors that are subject to effects of language universality and ambient language characteristics. Bilingual Chinese children were observed to use not only typical, but also atypical phonological errors compared to monolingual peers acquiring similar languages. Atypical errors are a result of specific bilingual pair effects. Close-language-relatedness (Cantonese-Mandarin) is claimed to be responsible for the nonexistence of atypical errors in both languages, whilst distant-language-relatedness (Cantonese-English) is observed to cause atypical errors in both languages. The present novel cross-sectional study investigated phonological acquisition in three typologically distant languages: English-Mandarin-Malay by 64 multilingual Chinese children aged 2½-4½. The present research aimed to explore whether multilingual Chinese children exhibit phonological errors commensurate with those of monolingual and bilingual Chinese children acquiring similar languages as described in the literature. The single-word phonological test results revealed that the multilinguals exhibited typical and atypical phonological patterns which were largely commensurate with those of the monolinguals and bilinguals. Similar to bilingual children, the multilingual children showed more atypical errors in English than in Mandarin, demonstrating effects of individual language irrespective of potential interaction with additional languages. The present result did not fully support the link between closeness in typology of languages and the absence of atypical errors. Rare atypical errors were found in Mandarin and Malay, two typologically different languages, and both were also interacting with English, another typologically different language. The present findings provided useful preliminary multilingual speech norms for the use of speech therapists.
    Matched MeSH terms: Phonetics*
  14. Leong CXR, Price JM, Pitchford NJ, van Heuven WJB
    PLoS One, 2018;13(10):e0204888.
    PMID: 30300372 DOI: 10.1371/journal.pone.0204888
    This paper evaluates a novel high variability phonetic training paradigm that involves presenting spoken words in adverse conditions. The effectiveness, generalizability, and longevity of this high variability phonetic training in adverse conditions was evaluated using English phoneme contrasts in three experiments with Malaysian multilinguals. Adverse conditions were created by presenting spoken words against background multi-talker babble. In Experiment 1, the adverse condition level was set at a fixed level throughout the training and in Experiment 2 the adverse condition level was determined for each participant before training using an adaptive staircase procedure. To explore the effectiveness and sustainability of the training, phonemic discrimination ability was assessed before and immediately after training (Experiments 1 and 2) and 6 months after training (Experiment 3). Generalization of training was evaluated within and across phonemic contrasts using trained and untrained stimuli. Results revealed significant perceptual improvements after just three 20-minute training sessions and these improvements were maintained after 6 months. The training benefits also generalized from trained to untrained stimuli. Crucially, perceptual improvements were significantly larger when the adverse conditions were adapted before each training session than when it was set at a fixed level. As the training improvements observed here are markedly larger than those reported in the literature, this indicates that the individualized phonetic training regime in adaptive adverse conditions (HVPT-AAC) is highly effective at improving speech perception.
    Matched MeSH terms: Phonetics
  15. Billings CJ, Grush LD, Maamor N
    Physiol Rep, 2017 Nov;5(20).
    PMID: 29051305 DOI: 10.14814/phy2.13464
    The effects of background noise on speech-evoked cortical auditory evoked potentials (CAEPs) can provide insight into the physiology of the auditory system. The purpose of this study was to determine background noise effects on neural coding of different phonemes within a syllable. CAEPs were recorded from 15 young normal-hearing adults in response to speech signals /s/, /ɑ/, and /sɑ/. Signals were presented at varying signal-to-noise ratios (SNRs). The effects of SNR and context (in isolation or within syllable) were analyzed for both phonemes. For all three stimuli, latencies generally decreased and amplitudes generally increased as SNR improved, and context effects were not present; however, the amplitude of the /ɑ/ response was the exception, showing no SNR effect and a significant context effect. Differential coding of /s/ and /ɑ/ likely result from level and timing differences. Neural refractoriness may result in the lack of a robust SNR effect on amplitude in the syllable context. The stable amplitude across SNRs in response to the vowel in /sɑ/ suggests the combined effects of (1) acoustic characteristics of the syllable and noise at poor SNRs and (2) refractory effects resulting from phoneme timing at good SNRs. Results provide insights into the coding of multiple-onset speech syllables in varying levels of background noise and, together with behavioral measures, may help to improve our understanding of speech-perception-in-noise difficulties.
    Matched MeSH terms: Phonetics*
  16. Phoon HS, Maclagan M, Abdullah AC
    Am J Speech Lang Pathol, 2015 Aug;24(3):517-32.
    PMID: 26125520 DOI: 10.1044/2015_AJSLP-14-0037
    This study investigated consonant cluster acquisition in Chinese-influenced Malaysian English (ChME)-speaking children.
    Matched MeSH terms: Phonetics*
  17. Lim HW, Wells B, Howard S
    Clin Linguist Phon, 2015;29(11):793-811.
    PMID: 26237032
    Early child multilingual acquisition is under-explored. Using a cross-sectional study approach, the present research investigates the rate of multilingual phonological acquisition of English-Mandarin-Malay by 64 ethnic Chinese children aged 2;06-4;05 in Malaysia--a multiracial-multilingual country of Asia. The aims of the study are to provide clinical norms for speech development in the multilingual children and to compare multilingual acquisition with monolingual and bilingual acquisition. An innovative multilingual phonological test which adopts well-defined scoring criteria drawing upon local accents of English, Mandarin and Malay is proposed and described in this article. This procedure has been neglected in the few existing Chinese bilingual phonological acquisition studies resulting in peculiar findings. The multilingual children show comparable phonological acquisition milestones to that of monolingual and bilingual peers acquiring the same languages. The implications of the present results are discussed. The present findings contribute to the development of models and theories of child multilingual acquisition.
    Matched MeSH terms: Phonetics*
  18. Muthusamy H, Polat K, Yaacob S
    PLoS One, 2015;10(3):e0120344.
    PMID: 25799141 DOI: 10.1371/journal.pone.0120344
    In recent years, many research works using speech-related features for speech emotion recognition have been published; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature.
    Matched MeSH terms: Phonetics*
  19. Phoon HS, Abdullah AC, Lee LW, Murugaiah P
    Clin Linguist Phon, 2014 May;28(5):329-45.
    PMID: 24446796 DOI: 10.3109/02699206.2013.868517
    To date, there has been little research done on phonological acquisition in the Malay language of typically developing Malay-speaking children. This study serves to fill this gap by providing a systematic description of Malay consonant acquisition in a large cohort of preschool-aged children between 4- and 6-years-old. In the study, 326 Malay-dominant speaking children were assessed using a picture naming task that elicited 53 single words containing all the primary consonants in Malay. Two main analyses were conducted to study their consonant acquisition: (1) age of customary and mastery production of consonants; and (2) consonant accuracy. Results revealed that Malay children acquired all the syllable-initial and syllable-final consonants before 4;06-years-old, with the exception of syllable-final /s/, /h/ and /l/ which were acquired after 5;06-years-old. The development of Malay consonants increased gradually from 4- to 6 years old, with female children performing better than male children. The accuracy of consonants based on manner of articulation showed that glides, affricates, nasals, and stops were higher than fricatives and liquids. In general, syllable-initial consonants were more accurate than syllable-final consonants while consonants in monosyllabic and disyllabic words were more accurate than polysyllabic words. These findings will provide significant information for speech-language pathologists for assessing Malay-speaking children and designing treatment objectives that reflect the course of phonological development in Malay.
    Matched MeSH terms: Phonetics*
  20. Mustafa MB, Salim SS, Mohamed N, Al-Qatab B, Siong CE
    PLoS One, 2014;9(1):e86285.
    PMID: 24466004 DOI: 10.1371/journal.pone.0086285
    Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution to build a speech acoustic model of impaired speech is by employing adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates the above-mentioned two issues on dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques like the maximum likelihood linear regression (MLLR) and the constrained-MLLR (C-MLLR). The recognition accuracy of each impaired speech acoustic model is measured in terms of word error rate (WER), with further assessments, including phoneme insertion, substitution and deletion rates. Unimpaired speech when combined with limited high-quality speech-impaired data improves performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech based on the statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor in WER in dysarthric speech for all levels of severity. The results show that the speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
    Matched MeSH terms: Phonetics