Displaying all 4 publications

Abstract:
Sort:
  1. Hossain ME, Jassim WA, Zilany MS
    PLoS One, 2016;11(3):e0150415.
    PMID: 26967160 DOI: 10.1371/journal.pone.0150415
    Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants.
  2. Islam MA, Jassim WA, Cheok NS, Zilany MS
    PLoS One, 2016;11(7):e0158520.
    PMID: 27392046 DOI: 10.1371/journal.pone.0158520
    Speaker identification under noisy conditions is one of the challenging topics in the field of speech processing applications. Motivated by the fact that the neural responses are robust against noise, this paper proposes a new speaker identification system using 2-D neurograms constructed from the responses of a physiologically-based computational model of the auditory periphery. The responses of auditory-nerve fibers for a wide range of characteristic frequency were simulated to speech signals to construct neurograms. The neurogram coefficients were trained using the well-known Gaussian mixture model-universal background model classification technique to generate an identity model for each speaker. In this study, three text-independent and one text-dependent speaker databases were employed to test the identification performance of the proposed method. Also, the robustness of the proposed method was investigated using speech signals distorted by three types of noise such as the white Gaussian, pink, and street noises with different signal-to-noise ratios. The identification results of the proposed neural-response-based method were compared to the performances of the traditional speaker identification methods using features such as the Mel-frequency cepstral coefficients, Gamma-tone frequency cepstral coefficients and frequency domain linear prediction. Although the classification accuracy achieved by the proposed method was comparable to the performance of those traditional techniques in quiet, the new feature was found to provide lower error rates of classification under noisy environments.
  3. Abdulhussain SH, Ramli AR, Saripan MI, Mahmmod BM, Al-Haddad SAR, Jassim WA
    Entropy (Basel), 2018 Mar 23;20(4).
    PMID: 33265305 DOI: 10.3390/e20040214
    The recent increase in the number of videos available in cyberspace is due to the availability of multimedia devices, highly developed communication technologies, and low-cost storage devices. These videos are simply stored in databases through text annotation. Content-based video browsing and retrieval are inefficient due to the method used to store videos in databases. Video databases are large in size and contain voluminous information, and these characteristics emphasize the need for automated video structure analyses. Shot boundary detection (SBD) is considered a substantial process of video browsing and retrieval. SBD aims to detect transition and their boundaries between consecutive shots; hence, shots with rich information are used in the content-based video indexing and retrieval. This paper presents a review of an extensive set for SBD approaches and their development. The advantages and disadvantages of each approach are comprehensively explored. The developed algorithms are discussed, and challenges and recommendations are presented.
  4. Lim PK, Ng SC, Jassim WA, Redmond SJ, Zilany M, Avolio A, et al.
    Sensors (Basel), 2015 Jun 16;15(6):14142-61.
    PMID: 26087370 DOI: 10.3390/s150614142
    We present a novel approach to improve the estimation of systolic (SBP) and diastolic blood pressure (DBP) from oscillometric waveform data using variable characteristic ratios between SBP and DBP with mean arterial pressure (MAP). This was verified in 25 healthy subjects, aged 28 ± 5 years. The multiple linear regression (MLR) and support vector regression (SVR) models were used to examine the relationship between the SBP and the DBP ratio with ten features extracted from the oscillometric waveform envelope (OWE). An automatic algorithm based on relative changes in the cuff pressure and neighbouring oscillometric pulses was proposed to remove outlier points caused by movement artifacts. Substantial reduction in the mean and standard deviation of the blood pressure estimation errors were obtained upon artifact removal. Using the sequential forward floating selection (SFFS) approach, we were able to achieve a significant reduction in the mean and standard deviation of differences between the estimated SBP values and the reference scoring (MLR: mean ± SD = -0.3 ± 5.8 mmHg; SVR and -0.6 ± 5.4 mmHg) with only two features, i.e., Ratio2 and Area3, as compared to the conventional maximum amplitude algorithm (MAA) method (mean ± SD = -1.6 ± 8.6 mmHg). Comparing the performance of both MLR and SVR models, our results showed that the MLR model was able to achieve comparable performance to that of the SVR model despite its simplicity.
Related Terms
Filters
Contact Us

Please provide feedback to Administrator (afdal@afpm.org.my)

External Links