An audio data representation for traffic acoustic scene recognition

Jiang, DZ; Huang, DM; Song, YY; Wu, KC; Lu, HK; Liu, QQ; Zhou, T

doi:10.1109/ACCESS.2020.3027474

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/88789

Title:	An audio data representation for traffic acoustic scene recognition
Authors:	Jiang, DZ; Huang, DM; Song, YY; Wu, KC; Lu, HK; Liu, QQ; Zhou, T
Issue Date:	2020
Source:	IEEE Access, 2020, v. 8, p. 177863-177873
Abstract:	Acoustic scene recognition (ASR), the task of recognizing an acoustic environment from an audio recording of the scene, has a wide range of applications, e.g. robotic navigation and audio forensics. However, ASR remains challenging, mainly because audio data are difficult to represent. In this article, we focus on traffic acoustic data. Traffic acoustic scene recognition provides information complementary to visual information about the scene; for example, it can be used to verify the result of visual perception. Because acoustic analysis and recognition are simple and convenient, they can effectively enhance perception that otherwise relies only on visual information. We propose an audio data representation method to improve the accuracy of traffic acoustic scene recognition. The proposed method employs the constant Q transform (CQT) and histogram of gradient (HOG) to transform one-dimensional audio signals into a time-frequency representation. We also propose two data representation mechanisms, called global and local feature selection, to select features that describe the shape of time-frequency structures. Finally, we exploit the least absolute shrinkage and selection operator (LASSO) technique to further improve recognition accuracy by selecting the most representative information for recognition. We conducted extensive experiments, and the results show that the proposed method is effective, significantly outperforming state-of-the-art methods.
Keywords:	Acoustics; Feature extraction; Spectrogram; Transforms; Histograms; Time-frequency analysis; Visualization; Acoustic scene recognition; Transportation; Acoustic material
Publisher:	Institute of Electrical and Electronics Engineers
Journal:	IEEE Access
EISSN:	2169-3536
DOI:	10.1109/ACCESS.2020.3027474
Rights:	This work is licensed under a Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/). The following publication is available at https://dx.doi.org/10.1109/ACCESS.2020.3027474: Jiang, D. Z., Huang, D. M., Song, Y. Y., Wu, K. C., Lu, H. K., Liu, Q. Q., & Zhou, T. (2020). An audio data representation for traffic acoustic scene recognition. IEEE Access, 8, 177863-177873.
Appears in Collections:	Journal/Magazine Article
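
The abstract describes applying a histogram of gradient (HOG) to a time-frequency representation so that the features capture the shape of time-frequency structures. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes the time-frequency matrix has already been computed (for example by a CQT routine from an audio library) and uses small hand-made matrices instead. It also computes a single global orientation histogram rather than the cell and block layout of a full HOG descriptor. The function name `orientation_histogram` and the parameter `n_bins` are hypothetical.

```python
import math


def orientation_histogram(tf, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    tf is a 2-D list: rows are frequency bins, columns are time frames.
    This is a simplified, single-cell stand-in for a HOG descriptor.
    """
    rows, cols = len(tf), len(tf[0])
    hist = [0.0] * n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences along time (gx) and frequency (gy).
            gx = (tf[r][c + 1] - tf[r][c - 1]) / 2.0
            gy = (tf[r + 1][c] - tf[r - 1][c]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Fold the orientation into [0, pi) so opposite directions match.
            angle = math.atan2(gy, gx) % math.pi
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[b] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy inputs (assumptions, not data from the paper):
# a steady tone varies only along frequency, an onset only along time.
tone = [[v] * 5 for v in (0.0, 1.0, 4.0, 1.0, 0.0)]
onset = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

print(orientation_histogram(tone))
print(orientation_histogram(onset))
```

In this toy case the steady tone puts its gradient energy in the bin around 90 degrees and the onset in the bin at 0 degrees, which illustrates how orientation histograms can separate differently shaped time-frequency structures. A feature selection step such as LASSO would then operate on vectors of this kind.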

Files in This Item:

File	Description	Size	Format
Jiang_Audio_Data_Representation.pdf		1.64 MB	Adobe PDF	View/Open

Open Access Information

Status	open access
File Version	Version of Record
