Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/102164
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Electrical and Electronic Engineering | en_US |
| dc.creator | Zuo, L | en_US |
| dc.creator | Mak, MW | en_US |
| dc.date.accessioned | 2023-10-11T01:57:56Z | - |
| dc.date.available | 2023-10-11T01:57:56Z | - |
| dc.identifier.issn | 0167-8655 | en_US |
| dc.identifier.uri | http://hdl.handle.net/10397/102164 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Elsevier | en_US |
| dc.rights | © 2023 Elsevier B.V. All rights reserved. | en_US |
| dc.rights | © 2023. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_US |
| dc.rights | The following publication Zuo, L., & Mak, M.-W. (2023). Avoiding dominance of speaker features in speech-based depression detection. Pattern Recognition Letters, 173, 50–56 is available at https://doi.org/10.1016/j.patrec.2023.07.016. | en_US |
| dc.subject | Depression detection | en_US |
| dc.subject | Feature disentanglement | en_US |
| dc.subject | Speaker embedding | en_US |
| dc.subject | Speaker invariance | en_US |
| dc.title | Avoiding dominance of speaker features in speech-based depression detection | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 50 | en_US |
| dc.identifier.epage | 56 | en_US |
| dc.identifier.volume | 173 | en_US |
| dc.identifier.doi | 10.1016/j.patrec.2023.07.016 | en_US |
| dcterms.abstract | The performance of speech-based depression detectors is limited by the scarcity and imbalance in depression data. We found that depression detectors could be strongly biased toward speaker features when the number of training speakers is insufficient. To address this issue, we propose a speaker-invariant depression detector (SIDD) that minimizes speaker information in the latent space. The SIDD consists of an autoencoder, a depression classifier, and a speaker-embedding projector. By incorporating speaker-embedding vectors into the autoencoder’s latent vectors, speaker information is effectively eliminated for the depression classifier. Experimental results demonstrate significant improvements achieved by minimizing speaker information, and our proposed method generally outperforms previous approaches for depression detection on the DAIC-WOZ dataset. | en_US |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | Pattern recognition letters, Sept 2023, v. 173, p. 50-56 | en_US |
| dcterms.isPartOf | Pattern recognition letters | en_US |
| dcterms.issued | 2023-09 | - |
| dc.identifier.eissn | 1872-7344 | en_US |
| dc.description.validate | 202310 bcch | en_US |
| dc.description.oa | Accepted Manuscript | en_US |
| dc.identifier.FolderNumber | a2475 | - |
| dc.identifier.SubFormID | 47754 | - |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | National Natural Science Foundation of China | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | Green (AAM) | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| Zuo_Avoiding_Dominance_Speaker.pdf | Pre-Published version | 2.02 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



