Please use this identifier to cite or link to this item:
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.creator | Jiang, Y | en_US
dc.creator | Leung, FH | en_US
dc.identifier.isbn | 978-1-5386-6811-5 (Electronic) | en_US
dc.identifier.isbn | 978-1-5386-6810-8 (USB) | en_US
dc.identifier.isbn | 978-1-5386-6812-2 (Print on Demand (PoD)) | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication Y. Jiang and F. H. F. Leung, "Comparison of Supervector and Majority Voting in Acoustic Scene Identification," 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Shanghai, China, 2018, pp. 1-5 is available at
dc.subject | Acoustic scene identification | en_US
dc.subject | Majority voting | en_US
dc.subject | Gaussian supervector | en_US
dc.subject | Factor analysis supervector | en_US
dc.title | Comparison of supervector and majority voting in acoustic scene identification | en_US
dc.type | Conference Paper | en_US
dcterms.abstract | Acoustic scene identification aims to identify the acoustic environment from the acoustic signal. Usually one first divides a piece of acoustic signal into multiple short-time frames and then calculates frame-level features. A natural question is then how to make use of these frame-level features for identification purposes. In this paper, we compare two feature aggregation methods. One method is Majority Voting (MV), which treats each frame-level feature as an independent feature vector and then performs identification using majority voting strategies. In this way, an acoustic signal is represented by multiple feature vectors. The other method is Supervector, which maps the frame-level features to a single feature vector. In this way, an acoustic signal is represented by one feature vector. In particular, we consider three types of Supervector: Gaussian Supervector, Factor Analysis Supervector, and i-vector. We then compare Supervector with MV in an acoustic identification task. Different classifiers are employed, including Gaussian Mixture Model (GMM), Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Deep Neural Network (DNN). Experimental results indicate that these two feature aggregation methods give very similar performance; nonetheless, each has its own advantages and disadvantages. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Shanghai, China, 19-21 Nov. 2018, p. 1-5 | en_US
dc.relation.conference | IEEE International Conference on Digital Signal Processing [DSP] | en_US
dc.description.validate | 202011 bcrc | en_US
dc.description.oa | Accepted Manuscript | en_US
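
The two aggregation strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `majority_vote` and `gaussian_supervector` and the `relevance` parameter are hypothetical, the frame-level labels are assumed to come from some already trained classifier, and the supervector is a simplified GMM mean supervector (MAP-adapted means stacked into one vector) that assumes equal component weights and identity covariances. Factor Analysis Supervector and i-vector are not covered.

```python
from collections import Counter
import math


def majority_vote(frame_labels):
    """Majority Voting: one label per frame in, the most frequent label out.

    Ties resolve to the label that was seen first.
    """
    return Counter(frame_labels).most_common(1)[0][0]


def gaussian_supervector(frames, ubm_means, relevance=16.0):
    """Map all frames of one signal to a single stacked vector.

    Simplifying assumptions (not from the paper): equal component
    weights and identity covariances in the background model.
    """
    k = len(ubm_means)
    dim = len(ubm_means[0])
    counts = [0.0] * k                      # soft frame count per component
    sums = [[0.0] * dim for _ in range(k)]  # posterior-weighted frame sums

    for x in frames:
        # Log-likelihood under each unit-variance Gaussian, up to a constant.
        logp = [-0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, m))
                for m in ubm_means]
        top = max(logp)  # subtract the max for numerical stability
        weights = [math.exp(lp - top) for lp in logp]
        total = sum(weights)
        for c in range(k):
            post = weights[c] / total
            counts[c] += post
            for d in range(dim):
                sums[c][d] += post * x[d]

    supervector = []
    for c in range(k):
        n = counts[c]
        # MAP adaptation: interpolate between the data mean and the prior mean.
        alpha = n / (n + relevance) if n > 0 else 0.0
        for d in range(dim):
            data_mean = sums[c][d] / n if n > 0 else ubm_means[c][d]
            supervector.append(alpha * data_mean
                               + (1.0 - alpha) * ubm_means[c][d])
    return supervector
```

With Majority Voting, a signal of N frames yields N classifier decisions that are then pooled by `majority_vote`. With the supervector route, the same N frames are first collapsed into one vector of length (number of components) × (feature dimension), and a classifier is applied once to that vector.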
Appears in Collections: Conference Paper
Files in This Item:
File | Description | Size | Format
Jiang_Comparison_Supervector_Majority.pdf | Pre-Published version | 998.2 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.