Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/110791
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | en_US
dc.creator | Jin, Z | en_US
dc.creator | Tu, Y | en_US
dc.creator | Gan, CX | en_US
dc.creator | Mak, MW | en_US
dc.creator | Lee, KA | en_US
dc.date.accessioned | 2025-02-04T07:11:10Z | -
dc.date.available | 2025-02-04T07:11:10Z | -
dc.identifier.issn | 0925-2312 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/110791 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier BV | en_US
dc.rights | © 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). | en_US
dc.rights | The following publication Jin, Z., Tu, Y., Gan, C.-X., Mak, M.-W., & Lee, K.-A. (2025). Adversarially adaptive temperatures for decoupled knowledge distillation with applications to speaker verification. Neurocomputing, 624, 129481 is available at https://doi.org/10.1016/j.neucom.2025.129481. | en_US
dc.subject | Adaptive temperature | en_US
dc.subject | Adversarial learning | en_US
dc.subject | Knowledge distillation | en_US
dc.subject | Speaker verification | en_US
dc.title | Adversarially adaptive temperatures for decoupled knowledge distillation with applications to speaker verification | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 624 | en_US
dc.identifier.doi | 10.1016/j.neucom.2025.129481 | en_US
dcterms.abstract | Knowledge Distillation (KD) aims to transfer knowledge from a high-capacity teacher model to a lightweight student model, thereby enabling the student model to attain a level of performance that would be unattainable through conventional training methods. In conventional KD, the loss function’s temperature that controls the smoothness of class distributions is fixed. We argue that distribution smoothness is critical to the transfer of knowledge and propose an adversarial adaptive temperature module to set the temperature dynamically during training to enhance the student’s performance. Using the concept of decoupled knowledge distillation (DKD), we separate the Kullback–Leibler (KL) divergence into a target-class term and a non-target-class term. However, unlike DKD, we adversarially update the temperature coefficients of the target and non-target classes to maximize the distillation loss. We name our method Adversarially Adaptive Temperature for DKD (AAT-DKD). Our approach demonstrates improvements over KD methods across three test sets of VoxCeleb1 for two student models (x-vector and ECAPA-TDNN). Specifically, compared to traditional KD and DKD, our method achieves remarkable reductions of 17.78% and 11.90% in EER using ECAPA-TDNN speaker embeddings. Moreover, our method performs well on CN-Celeb and VoxSRC21, further highlighting its robustness and effectiveness across different datasets. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Neurocomputing, 1 Apr. 2025, v. 624, 129481 | en_US
dcterms.isPartOf | Neurocomputing | en_US
dcterms.issued | 2025-04-01 | -
dc.identifier.scopus | 2-s2.0-85216017012 | -
dc.identifier.eissn | 1872-8286 | en_US
dc.identifier.artn | 129481 | en_US
dc.description.validate | 202502 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_TA, a3641 | -
dc.identifier.SubFormID | 50553 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.TA | Elsevier (2025) | en_US
dc.description.oaCategory | TA | en_US
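The abstract above describes decomposing the KD Kullback–Leibler divergence into a target-class term and a non-target-class term (as in DKD) and adversarially updating the two temperature coefficients to maximize the distillation loss while the student minimizes it. Below is a minimal PyTorch sketch of that idea, not the authors' released code: the class and function names (AdaptiveTemps, dkd_loss, train_step), the initial temperature, the T^2 scaling, and the single-backward sign-flip update are illustrative assumptions, and DKD's alpha/beta weighting of the two terms is omitted.

```python
# Minimal sketch of decoupled KD with two learnable, adversarially updated temperatures.
# Illustrative only -- names, hyperparameters, and the sign-flip update are assumptions
# made for this sketch, not the authors' implementation.
import torch
import torch.nn.functional as F


class AdaptiveTemps(torch.nn.Module):
    """Learnable temperatures for the target-class and non-target-class KL terms."""

    def __init__(self, init_temp: float = 4.0):
        super().__init__()
        # Parameterize in log space so both temperatures stay positive.
        self.log_t = torch.nn.Parameter(torch.log(torch.tensor([init_temp, init_temp])))

    def forward(self):
        t = self.log_t.exp()
        return t[0], t[1]  # (target temperature, non-target temperature)


def dkd_loss(s_logits, t_logits, labels, temp_tgt, temp_ntg):
    """Binary target-vs-rest KL term plus a KL term over the non-target classes,
    each softened by its own temperature (DKD's alpha/beta weights omitted)."""
    mask = F.one_hot(labels, s_logits.size(-1)).bool()

    def target_vs_rest(logits, temp):
        # Two-way distribution: probability of the target class vs. everything else.
        p_target = F.softmax(logits / temp, dim=-1).masked_select(mask).unsqueeze(-1)
        return torch.cat([p_target, 1.0 - p_target], dim=-1).clamp_min(1e-8)

    def nontarget(logits, temp, as_log):
        # Distribution over the non-target classes only (target column removed).
        nt = logits[~mask].view(logits.size(0), -1) / temp
        return F.log_softmax(nt, dim=-1) if as_log else F.softmax(nt, dim=-1)

    tckd = F.kl_div(target_vs_rest(s_logits, temp_tgt).log(),
                    target_vs_rest(t_logits, temp_tgt),
                    reduction="batchmean") * temp_tgt ** 2
    nckd = F.kl_div(nontarget(s_logits, temp_ntg, as_log=True),
                    nontarget(t_logits, temp_ntg, as_log=False),
                    reduction="batchmean") * temp_ntg ** 2
    return tckd + nckd


def train_step(student, teacher, temps, opt_student, opt_temps, x, labels):
    """One adversarial step: the student descends on the loss, the temperatures ascend."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)

    temp_tgt, temp_ntg = temps()
    loss = dkd_loss(s_logits, t_logits, labels, temp_tgt, temp_ntg)

    opt_student.zero_grad()
    opt_temps.zero_grad()
    loss.backward()
    opt_student.step()                    # minimize the loss w.r.t. the student
    for p in temps.parameters():
        if p.grad is not None:
            p.grad.neg_()                 # flip the sign: maximize w.r.t. the temperatures
    opt_temps.step()
    return loss.detach()
```

Under these assumptions, usage would be one optimizer over the student's parameters and a separate (typically smaller-learning-rate) optimizer over temps.parameters(), calling train_step per mini-batch. The log-space parameterization and the gradient sign flip are just one way to realize the min-max objective; the paper's exact update rule, term weighting, and any bounds on the temperatures may differ.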
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
1-s2.0-S0925231225001535-main.pdf | - | 2.15 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record