MyMedR

Displaying all 3 publications

Particle swarm optimization based feature enhancement and feature selection for improved emotion recognition in speech and glottal signals

Muthusamy H, Polat K, Yaacob S

PLoS One, 2015;10(3):e0120344.
PMID: 25799141 DOI: 10.1371/journal.pone.0120344

In recent years, many studies have used speech-related features for speech emotion recognition; however, recent studies show a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were used to evaluate the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed method significantly improves speech emotion recognition performance compared with previous works published in the literature.

Matched MeSH terms: Speech Recognition Software*

Severity-based adaptation with limited data for ASR to aid dysarthric speakers

Mustafa MB, Salim SS, Mohamed N, Al-Qatab B, Siong CE

PLoS One, 2014;9(1):e86285.
PMID: 24466004 DOI: 10.1371/journal.pone.0086285

Automatic speech recognition (ASR) is currently used in many assistive technologies, for example to help individuals with speech impairment communicate. One challenge in ASR for speech-impaired individuals is the difficulty of obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, and those are limited in size, the obvious way to build an acoustic model of impaired speech is to employ adaptation techniques. However, two issues have not been addressed in existing studies on adaptation for speech impairment: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality speech-impaired data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. Phoneme substitution was found to be the biggest contributing factor to WER in dysarthric speech for all levels of severity. The results show that speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
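
The abstract reports word error rate (WER) together with insertion, substitution and deletion rates. As an illustration only, and not the authors' evaluation code, the following minimal Python sketch computes these quantities with a standard Levenshtein alignment. The function names `align_counts` and `word_error_rate` and the example sentences are hypothetical, and WER is assumed to follow the usual definition (S + D + I) / N, where N is the number of reference tokens.

```python
def align_counts(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for one minimum-cost
    Levenshtein alignment of two token sequences (words or phonemes)."""
    ref, hyp = list(reference), list(hypothesis)
    # dp[i][j] = (total_errors, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # only deletions remain
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # only insertions remain
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, n)          # match, no cost
            else:
                diag = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = dp[i - 1][j]
            delete = (t + 1, s, d + 1, n)    # reference token missing
            t, s, d, n = dp[i][j - 1]
            insert = (t + 1, s, d, n + 1)    # extra hypothesis token
            dp[i][j] = min(diag, delete, insert, key=lambda c: c[0])
    return dp[-1][-1][1:]


def word_error_rate(reference_text, hypothesis_text):
    """WER = (S + D + I) / N, with N the number of reference words."""
    ref = reference_text.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = align_counts(ref, hypothesis_text.split())
    return (subs + dels + ins) / len(ref)


# Hypothetical example: one substitution (sat -> sit), one deletion (the).
counts = align_counts("the cat sat on the mat".split(),
                      "the cat sit on mat".split())
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

For the example sentences, `counts` is `(1, 1, 0)` and `wer` is 2/6, about 0.33. Passing phoneme sequences instead of word lists to `align_counts` gives the per-type counts from which phoneme insertion, substitution and deletion rates can be derived. When several alignments share the same minimum cost, the split between error types depends on the tie-breaking order; the total error count does not.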

Matched MeSH terms: Speech Recognition Software*

Science stars of East Asia

Cyranoski D, Law YH, Ong S, Phillips N, Zastrow M

Nature, 2018 06;558(7711):502-510.
PMID: 29950631 DOI: 10.1038/d41586-018-05506-1
Matched MeSH terms: Speech Recognition Software*
