Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/95326
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.creator | Yi, L | en_US
dc.creator | Mak, MW | en_US
dc.date.accessioned | 2022-09-19T01:59:41Z | -
dc.date.available | 2022-09-19T01:59:41Z | -
dc.identifier.issn | 2162-237X | en_US
dc.identifier.uri | http://hdl.handle.net/10397/95326 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication L. Yi and M.-W. Mak, "Improving Speech Emotion Recognition With Adversarial Data Augmentation Network," in IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 1, pp. 172-184, Jan. 2022 is available at https://doi.org/10.1109/TNNLS.2020.3027600. | en_US
dc.subject | Data augmentation | en_US
dc.subject | Generative adversarial networks (GANs) | en_US
dc.subject | Speech emotion recognition | en_US
dc.subject | Wasserstein divergence | en_US
dc.title | Improving speech emotion recognition with adversarial data augmentation network | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 172 | en_US
dc.identifier.epage | 184 | en_US
dc.identifier.volume | 33 | en_US
dc.identifier.issue | 1 | en_US
dc.identifier.doi | 10.1109/TNNLS.2020.3027600 | en_US
dcterms.abstract | When training data are scarce, it is challenging to train a deep neural network without overfitting. To overcome this challenge, this article proposes a new data augmentation network, namely the adversarial data augmentation network (ADAN), based on generative adversarial networks (GANs). The ADAN consists of a GAN, an autoencoder, and an auxiliary classifier. These networks are trained adversarially to synthesize class-dependent feature vectors in both the latent space and the original feature space, which can be added to the real training data for training classifiers. Instead of the conventional cross-entropy loss for adversarial training, the Wasserstein divergence is used in an attempt to produce high-quality synthetic samples. The proposed networks were applied to speech emotion recognition using EmoDB and IEMOCAP as the evaluation data sets. It was found that by forcing the synthetic latent vectors and the real latent vectors to share a common representation, the gradient vanishing problem can be largely alleviated. Results also show that the augmented data generated by the proposed networks are rich in emotion information. Thus, the resulting emotion classifiers are competitive with state-of-the-art speech emotion recognition systems. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE transactions on neural networks and learning systems, Jan. 2022, v. 33, no. 1, p. 172-184 | en_US
dcterms.isPartOf | IEEE transactions on neural networks and learning systems | en_US
dcterms.issued | 2022-01 | -
dc.identifier.scopus | 2-s2.0-85116598107 | -
dc.identifier.pmid | 33035171 | -
dc.identifier.eissn | 2162-2388 | en_US
dc.description.validate | 202209 bcvc | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | RGC-B2-0262, a1720 | -
dc.identifier.SubFormID | 45835 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
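
The abstract above names the Wasserstein divergence as the adversarial objective that replaces the usual cross-entropy loss. As a rough illustration only, not the authors' ADAN implementation, the following minimal PyTorch sketch shows a Wasserstein-divergence critic/generator pair for feature-vector synthesis; the network sizes, the constants k = 2 and p = 6 (the common WGAN-div defaults), and all names below are assumptions, not values taken from the paper.

    import torch
    import torch.nn as nn

    FEAT_DIM, LATENT_DIM = 80, 32   # illustrative dimensions (assumed, not from the paper)
    K, P = 2.0, 6.0                 # k and p of the Wasserstein-divergence penalty (WGAN-div defaults)

    # Stand-in critic and generator operating on fixed-length feature vectors.
    critic = nn.Sequential(nn.Linear(FEAT_DIM, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
    generator = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(), nn.Linear(128, FEAT_DIM))

    def critic_loss(real, fake):
        # Wasserstein-divergence objective: E[f(fake)] - E[f(real)]
        # + k * E[||grad_x f(x_hat)||^p], where x_hat is interpolated
        # between real and synthetic samples.
        alpha = torch.rand(real.size(0), 1)
        x_hat = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
        grad = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
        penalty = K * grad.norm(2, dim=1).pow(P).mean()
        return critic(fake.detach()).mean() - critic(real).mean() + penalty

    def generator_loss(fake):
        # The generator tries to raise the critic's score on synthetic features.
        return -critic(fake).mean()

    # One adversarial training step on a toy batch of "real" feature vectors.
    d_opt = torch.optim.Adam(critic.parameters(), lr=1e-4)
    g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
    real = torch.randn(16, FEAT_DIM)                 # placeholder for real emotion features
    fake = generator(torch.randn(16, LATENT_DIM))
    d_opt.zero_grad(); critic_loss(real, fake).backward(); d_opt.step()
    g_opt.zero_grad(); generator_loss(fake).backward(); g_opt.step()

After training, synthetic vectors generator(z) would be pooled with the real feature vectors to enlarge the classifier's training set, which is the augmentation role the abstract describes; the gradient-norm penalty is what keeps the critic from saturating, consistent with the abstract's remark on alleviating vanishing gradients.
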
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
T-NNLS-emotion.pdf | Pre-Published version | 10.19 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.