Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/92350
Title: Predicting gender and age categories in English conversations using lexical, non-lexical, and turn-taking features
Authors: Liesenfeld, A; Parti, G; Hsu, YY; Huang, CR
Issue Date: Oct-2020
Source: In ML Nguyen, MC Luong & S Song (Eds.), Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation, 24-26 October, 2020, University of Science, Vietnam National University Hanoi, Vietnam, p. 157-166. Association for Computational Linguistics, 2020
Abstract: This paper examines gender and age salience and (stereo)typicality in British English talk with the aim of predicting gender and age categories based on lexical, phrasal and turn-taking features. We examine the SpokenBNC, a corpus of around 11.4 million words of British English conversations, and identify behavioural differences between speakers who are labelled for gender and age categories. We explore differences in language use and turn-taking dynamics and identify a range of characteristics that set the categories apart. We find that female speakers tend to produce more and slightly longer turns, while turns by male speakers feature a higher type-token ratio and a distinct range of minimal particles such as “eh”, “uh” and “em”. Across age groups, we observe, for instance, that swear words and laughter characterize young speakers’ talk, while old speakers tend to produce more truncated words. We then use the observed characteristics to predict gender and age labels of speakers per conversation and per turn as a classification task, showing that non-lexical utterances such as minimal particles, which are usually left out of dialog data, can contribute to setting the categories apart.
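Illustration: the per-speaker measures named in the abstract (number of turns, turn length, type-token ratio, minimal particles) could be computed roughly as in the sketch below. This is not the authors' implementation. The function name `speaker_features`, the `(speaker_id, text)` input format, the whitespace tokenization, and the three-item particle list (taken only from the examples quoted in the abstract) are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed particle list: only the three examples quoted in the abstract.
MINIMAL_PARTICLES = {"eh", "uh", "em"}


def speaker_features(turns):
    """Compute simple per-speaker features.

    turns: list of (speaker_id, text) pairs in conversation order
    (a hypothetical input format, not the SpokenBNC file format).
    """
    tokens_by_speaker = defaultdict(list)
    turn_counts = defaultdict(int)
    for speaker, text in turns:
        tokens = text.lower().split()  # naive whitespace tokenization
        tokens_by_speaker[speaker].extend(tokens)
        turn_counts[speaker] += 1

    features = {}
    for speaker, tokens in tokens_by_speaker.items():
        n_tokens = len(tokens)
        features[speaker] = {
            "turns": turn_counts[speaker],
            # mean number of tokens per turn
            "mean_turn_length": n_tokens / turn_counts[speaker],
            # distinct word forms divided by total tokens
            "type_token_ratio": len(set(tokens)) / n_tokens if n_tokens else 0.0,
            "minimal_particles": sum(t in MINIMAL_PARTICLES for t in tokens),
        }
    return features


# Toy conversation, invented for this example.
example = [
    ("A", "uh I think so"),
    ("B", "yeah yeah I know"),
    ("A", "eh maybe"),
]
result = speaker_features(example)
```

Feature dictionaries of this kind could then be passed to any standard classifier. Note that type-token ratio is sensitive to the amount of text per speaker, so comparisons across speakers with very different token counts should be made with care.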
Publisher: Association for Computational Linguistics
Description: 34th Pacific Asia Conference on Language, Information and Computation, Oct. 2020, Hanoi, Vietnam
Rights: Copyright of contributed papers reserved by respective authors.
Posted with permission of the author.
The following publication Andreas Liesenfeld, Gábor Parti, Yuyin Hsu, and Chu-Ren Huang. 2020. Predicting gender and age categories in English conversations using lexical, non-lexical, and turn-taking features. In Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation, pages 157–166, Hanoi, Vietnam. Association for Computational Linguistics is available at https://aclanthology.org/2020.paclic-1.19/.
Appears in Collections: Conference Paper

Files in This Item:
File: 2020.paclic-1.19.pdf
Size: 730.6 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.