Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/113412
Title: ConFusionformer: locality-enhanced Conformer through multi-resolution attention fusion for speaker verification
Authors: Tu, Y; Mak, MW; Lee, KA; Lin, W
Issue Date: Sep-2025
Source: Neurocomputing, 1 Sept 2025, v. 644, 130429
Abstract: Conformers are capable of capturing both global and local dependencies in a sequence. Notably, the modeling of local information is critical to learning speaker characteristics. However, applying Conformers to speaker verification (SV) has seen limited success due to their inferior locality modeling capability and low computational efficiency. In this paper, we propose an improved Conformer, ConFusionformer, to address these two challenges. To increase model efficiency, the conventional Conformer block is modified by placing one feed-forward network between a self-attention module and a convolution module. The modified Conformer block has fewer model parameters, thus reducing the computation cost. The modification also enables a deeper network, boosting the SV performance. Moreover, multi-resolution attention fusion is introduced into the self-attention mechanism to improve locality modeling. Specifically, a low-resolution attention score map, computed from downsampled queries and keys, is restored to the original resolution and fused with the original attention score map to exploit the local information within the restored local regions. The proposed ConFusionformer is shown to outperform the Conformer for SV on VoxCeleb, CNCeleb, SRE21, and SRE24, demonstrating the superiority of the ConFusionformer in speaker modeling.
Keywords: Conformer; Multi-resolution attention fusion; Speaker embedding; Speaker verification; Transformer
Publisher: Elsevier BV
Journal: Neurocomputing
ISSN: 0925-2312
EISSN: 1872-8286
DOI: 10.1016/j.neucom.2025.130429
Appears in Collections: Journal/Magazine Article
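Below is a minimal PyTorch sketch of the multi-resolution attention fusion and the modified block layout described in the abstract above. It is reconstructed from the abstract alone, not the authors' implementation: the average-pooling downsampling of queries and keys, the nearest-neighbour restoration of the low-resolution score map, the fixed fusion weight, and the simplified feed-forward and convolution modules (as well as the class names MultiResolutionAttentionFusion and ConFusionformerBlock) are all assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiResolutionAttentionFusion(nn.Module):
    """Self-attention whose score map is fused with a restored low-resolution score map.

    Assumed details (not specified in the abstract): average pooling over time for
    downsampling, nearest-neighbour upsampling for restoration, fixed fusion weight.
    """

    def __init__(self, d_model: int, n_heads: int, downsample: int = 2, fuse_weight: float = 0.5):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_k = n_heads, d_model // n_heads
        self.downsample, self.fuse_weight = downsample, fuse_weight
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model)
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.h, self.d_k).transpose(1, 2)  # (B, h, T, d_k)
        k = self.k_proj(x).view(B, T, self.h, self.d_k).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.h, self.d_k).transpose(1, 2)

        # Original (full-resolution) attention score map: (B, h, T, T).
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.d_k ** 0.5

        # Low-resolution score map from downsampled queries and keys.
        q_lr = F.avg_pool1d(q.reshape(B * self.h, T, self.d_k).transpose(1, 2),
                            self.downsample, self.downsample).transpose(1, 2)
        k_lr = F.avg_pool1d(k.reshape(B * self.h, T, self.d_k).transpose(1, 2),
                            self.downsample, self.downsample).transpose(1, 2)
        scores_lr = torch.matmul(q_lr, k_lr.transpose(-2, -1)) / self.d_k ** 0.5

        # Restore the low-resolution map to (T, T); each restored entry is shared by a
        # local block of frames, which is what injects the locality bias.
        restored = F.interpolate(scores_lr.unsqueeze(1), size=(T, T), mode="nearest")
        restored = restored.squeeze(1).view(B, self.h, T, T)

        # Fuse the restored map with the original score map, then attend as usual.
        fused = (1.0 - self.fuse_weight) * scores + self.fuse_weight * restored
        out = torch.matmul(torch.softmax(fused, dim=-1), v)
        return self.out_proj(out.transpose(1, 2).reshape(B, T, self.h * self.d_k))


class Transpose12(nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x.transpose(1, 2)


class ConFusionformerBlock(nn.Module):
    """Assumed block layout per the abstract: a single feed-forward network placed
    between the self-attention and convolution modules (module internals simplified)."""

    def __init__(self, d_model: int, n_heads: int, ffn_dim: int = 1024, kernel_size: int = 15):
        super().__init__()
        self.attn = MultiResolutionAttentionFusion(d_model, n_heads)
        self.ffn = nn.Sequential(nn.LayerNorm(d_model), nn.Linear(d_model, ffn_dim),
                                 nn.SiLU(), nn.Linear(ffn_dim, d_model))
        self.conv = nn.Sequential(nn.LayerNorm(d_model), Transpose12(),
                                  nn.Conv1d(d_model, d_model, kernel_size,
                                            padding=kernel_size // 2, groups=d_model),
                                  Transpose12())
        self.norm_attn = nn.LayerNorm(d_model)
        self.norm_out = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.norm_attn(x))  # locality-enhanced self-attention
        x = x + self.ffn(x)                   # single FFN between attention and convolution
        x = x + self.conv(x)                  # convolution module (reduced to a depthwise conv)
        return self.norm_out(x)

Because each restored entry of the low-resolution map is shared by a neighbourhood of frames, the fusion biases attention toward local regions, which is the locality enhancement the abstract refers to. Using a single feed-forward network in place of the two macaron-style FFNs of a standard Conformer block is also what reduces the parameter count per block, allowing a deeper stack at a similar cost.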