Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/75830
Title: GMAT : glottal closure instants detection based on the multiresolution absolute Teager-Kaiser energy operator
Authors: Wu, KB
Zhang, D 
Lu, GM
Keywords: Glottal closure instants detection
Multiresolution
Pooling techniques
Teager-Kaiser Energy Operator (TKEO)
Issue Date: 2017
Publisher: Academic Press
Source: Digital signal processing : a review journal, 2017, v. 69, p. 286-299
Journal: Digital signal processing : a review journal 
Abstract: Glottal Closure Instants (GCIs) detection is important to many speech applications. However, most existing algorithms cannot achieve computational efficiency and accuracy simultaneously. In this paper, we present Glottal closure instants detection based on the Multiresolution Absolute TKEO (GMAT), which detects GCIs with high accuracy and low computational cost. Considering the nonlinearity in speech production, the Teager-Kaiser Energy Operator (TKEO) is utilized to detect GCIs: an instant with a high absolute TKEO value often indicates a GCI. To enhance robustness, three multiscale pooling techniques (max pooling, multiscale product, and mean pooling) are applied to fuse the absolute TKEOs of several scales. Finally, GCIs are detected based on the fused results. In the performance evaluation, GMAT is compared with three state-of-the-art methods: MSM (Most Singular Manifold-based approach), ZFR (Zero Frequency Resonator-based method), and SEDREAMS (Speech Event Detection using the Residual Excitation And a Mean-based Signal). On clean speech, experiments show that GMAT attains a higher identification rate and accuracy than MSM. Compared with ZFR and SEDREAMS, GMAT gives almost the same reliability and higher accuracy. In addition, on noisy speech, GMAT demonstrates the highest robustness for most SNR levels. Additional comparison shows that GMAT is less sensitive to the choice of scale in multiscale processing and has low computational cost. Finally, pathological speech identification, a concrete application of GCIs, is included to show the efficacy of GMAT in practice. Through this paper, we investigate the potential of TKEO for GCI detection, and the proposed algorithm GMAT can detect GCIs with high accuracy and low computational cost. Owing to these advantages, GMAT is a promising choice for GCI detection, particularly in real-time scenarios. Hence, this work may contribute to systems relying on GCIs, where both accuracy and computational cost are crucial.
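Illustrative note: the core idea in the abstract (absolute TKEO values computed at several scales, then fused by max pooling, multiscale product, or mean pooling) can be sketched roughly as follows. This is not the authors' implementation. It assumes the standard discrete TKEO, x[n]^2 - x[n-1]*x[n+1], and further assumes that a "scale" can be modelled as a sample lag k in that formula; the function names `abs_tkeo` and `fuse`, the default lags, and the zero-padded edges are hypothetical choices made only for illustration. The final step of picking GCIs from the fused signal is not shown, since the abstract does not describe it.

```python
import numpy as np


def abs_tkeo(x, k=1):
    """Absolute TKEO at lag k: |x[n]^2 - x[n-k] * x[n+k]|.

    Assumption: the lag k stands in for the "scale"; the first and
    last k samples are left at zero because the operator is undefined there.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if len(x) > 2 * k:
        out[k:-k] = np.abs(x[k:-k] ** 2 - x[:-2 * k] * x[2 * k:])
    return out


def fuse(x, lags=(1, 2, 3), mode="max"):
    """Fuse absolute TKEOs of several lags with one of three pooling rules."""
    stack = np.stack([abs_tkeo(x, k) for k in lags])
    if mode == "max":
        return stack.max(axis=0)    # max pooling across scales
    if mode == "product":
        return stack.prod(axis=0)   # multiscale product
    if mode == "mean":
        return stack.mean(axis=0)   # mean pooling across scales
    raise ValueError(f"unknown pooling mode: {mode!r}")


if __name__ == "__main__":
    # Toy signal: a single impulse, standing in for an abrupt excitation.
    signal = np.zeros(11)
    signal[5] = 1.0
    for mode in ("max", "product", "mean"):
        print(mode, int(np.argmax(fuse(signal, mode=mode))))
```

On this toy impulse all three pooling rules peak at the impulse position; real speech would additionally need a peak-selection step that this sketch leaves out.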
URI: http://hdl.handle.net/10397/75830
ISSN: 1051-2004
EISSN: 1095-4333
DOI: 10.1016/j.dsp.2017.07.006
Appears in Collections: Journal/Magazine Article


Scopus citations: 1 (as of May 11, 2018)
Page views: 3 (as of May 21, 2018)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.