Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118082
Title: Uncertainty estimation for sound source localization with deep learning
Authors: Pi, R 
Yu, X 
Issue Date: 2025
Source: IEEE transactions on instrumentation and measurement, 2025, v. 74, 2502512
Abstract: While significant progress has been made in the field of sound source localization (SSL), the confidence and robustness of the localization results remain low. Uncertainty analysis can alleviate this problem because it provides a measure of the confidence level in the SSL results. In this work, we propose a novel framework for SSL that not only delivers state-of-the-art localization performance but also provides reliable uncertainty estimates. Our framework uses a novel backbone architecture that integrates a multihead self-attention module to capture spatial features. Additionally, our approach incorporates subjective logic (SL) theory to associate the predictions obtained from the neural network with a Dirichlet distribution. This allows us to model the overall uncertainty by parameterizing the class probabilities of the positions of the sound source. To comprehensively evaluate the performance of the proposed method, extensive experiments were conducted using both simulated and real-world datasets. The results show that the proposed method can improve the SSL accuracy (ACC) and enhance the neural network's reliability, and that it can handle even out-of-distribution samples effectively. The accurate sound source positions and uncertainty estimates obtained can be used in downstream audio-related tasks, for example to enhance the ACC and reliability of sound event detection by incorporating uncertainty. This integration can assist robots in making more informed decisions by fusing information from multiple sources. Our code is available at https://github.com/Devin-Pi/uncertainty-estimation-for-ssl.
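Note: The abstract describes associating network predictions with a Dirichlet distribution through subjective logic. The sketch below illustrates the standard subjective-logic mapping from per-class evidence to belief masses and a single uncertainty mass. It is not taken from the paper or its repository: the function name `dirichlet_opinion`, the idea that each class is a candidate source position, and the assumption that the network emits non-negative evidence (for example through a ReLU or softplus output layer) are all illustrative assumptions.

```python
from typing import List, Tuple


def dirichlet_opinion(evidence: List[float]) -> Tuple[List[float], float, List[float]]:
    """Map non-negative per-class evidence to a subjective-logic opinion.

    Hypothetical helper for illustration only. Each entry of `evidence`
    is assumed to be the network's evidence for one candidate position.
    Returns (belief masses, uncertainty mass, expected class probabilities).
    """
    if not evidence:
        raise ValueError("evidence must contain at least one class")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    # Dirichlet concentration parameters: alpha_k = e_k + 1.
    alpha = [e + 1.0 for e in evidence]
    # Dirichlet strength S = sum_k alpha_k.
    strength = sum(alpha)
    # Belief mass per class: b_k = e_k / S.
    belief = [e / strength for e in evidence]
    # Uncertainty mass: u = K / S, so that sum_k b_k + u = 1.
    uncertainty = num_classes / strength
    # Expected class probability under the Dirichlet: alpha_k / S.
    expected_prob = [a / strength for a in alpha]
    return belief, uncertainty, expected_prob


if __name__ == "__main__":
    # No evidence at all: the opinion is fully uncertain.
    print(dirichlet_opinion([0.0, 0.0, 0.0, 0.0]))
    # Strong evidence for the first position: uncertainty shrinks.
    print(dirichlet_opinion([40.0, 1.0, 1.0, 0.0]))
```

Under this mapping, an input that yields little evidence for any position (as an out-of-distribution sample might) produces an uncertainty mass close to 1, while strong evidence for one position drives it toward 0.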
Keywords: Attention mechanism
Deep learning
Moving sound source localization (SSL)
Subjective logic (SL) theory
Uncertainty estimation
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE transactions on instrumentation and measurement 
ISSN: 0018-9456
EISSN: 1557-9662
DOI: 10.1109/TIM.2024.3522632
Rights: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication R. Pi and X. Yu, 'Uncertainty Estimation for Sound Source Localization With Deep Learning,' in IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-12, 2025, Art no. 2502512 is available at https://doi.org/10.1109/TIM.2024.3522632.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Pi_Uncertainty_Estimation_Sound.pdf
Description: Pre-Published version
Size: 7.49 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.