Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/113412
Title: ConFusionformer: locality-enhanced conformer through multi-resolution attention fusion for speaker verification
Authors: Tu, Y 
Mak, MW 
Lee, KA 
Lin, W 
Issue Date: Sep-2025
Source: Neurocomputing, 1 Sept 2025, v. 644, 130429
Abstract: Conformers can capture both global and local dependencies in a sequence, and modeling local information is critical to learning speaker characteristics. However, applying Conformers to speaker verification (SV) has seen limited success because of their inferior locality modeling and low computational efficiency. In this paper, we propose an improved Conformer, ConFusionformer, to address these two challenges. To increase model efficiency, the conventional Conformer block is modified by placing a single feed-forward network between the self-attention module and the convolution module. The modified Conformer block has fewer parameters, reducing computational cost, and the modification also enables a deeper network, boosting SV performance. Moreover, multi-resolution attention fusion is introduced into the self-attention mechanism to improve locality modeling: a low-resolution attention score map, computed from downsampled queries and keys, is restored to the original resolution and fused with the full-resolution attention score map to exploit the local information within the restored region. The proposed ConFusionformer outperforms the Conformer for SV on VoxCeleb, CNCeleb, SRE21, and SRE24, demonstrating its superiority in speaker modeling.
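The multi-resolution attention fusion described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: the pooling factor, the fusion weight `alpha`, additive fusion, and nearest-neighbour restoration are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fused_attention(Q, K, V, pool=2, alpha=0.5):
    """Illustrative multi-resolution attention fusion (hypothetical sketch).

    Q, K, V: (T, d) arrays with T divisible by `pool`.
    pool:    temporal downsampling factor for the low-resolution branch.
    alpha:   fusion weight for the restored low-resolution scores.
    """
    T, d = Q.shape
    # Full-resolution attention score map.
    scores = Q @ K.T / np.sqrt(d)                        # (T, T)
    # Downsample queries and keys by average pooling along time.
    Ql = Q.reshape(T // pool, pool, d).mean(axis=1)      # (T/pool, d)
    Kl = K.reshape(T // pool, pool, d).mean(axis=1)      # (T/pool, d)
    low = Ql @ Kl.T / np.sqrt(d)                         # (T/pool, T/pool)
    # Restore the low-resolution map to full resolution
    # (nearest-neighbour upsampling over both axes).
    restored = np.repeat(np.repeat(low, pool, axis=0), pool, axis=1)  # (T, T)
    # Fuse the two score maps, then attend over the values.
    fused = scores + alpha * restored
    return softmax(fused) @ V
```

Because each restored entry is shared by a `pool`-by-`pool` block of positions, the fused scores emphasize coarse local neighbourhoods on top of the token-level attention, which is the locality effect the abstract attributes to the fusion.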
Keywords: Conformer
Multi-resolution attention fusion
Speaker embedding
Speaker verification
Transformer
Publisher: Elsevier BV
Journal: Neurocomputing 
ISSN: 0925-2312
EISSN: 1872-8286
DOI: 10.1016/j.neucom.2025.130429
Appears in Collections:Journal/Magazine Article

Open Access Information
Status: embargoed access
Embargo End Date: 2027-09-01


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.