Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118012
DC Field | Value | Language
dc.contributor | Department of English and Communication | -
dc.creator | Lu, J | -
dc.creator | Rogers, J | -
dc.date.accessioned | 2026-03-12T01:02:48Z | -
dc.date.available | 2026-03-12T01:02:48Z | -
dc.identifier.uri | http://hdl.handle.net/10397/118012 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier Ltd. | en_US
dc.rights | © 2026 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). | en_US
dc.rights | The following publication Lu, J., & Rogers, J. (2026). Evaluating and enhancing the accuracy of automated fluency annotation tools in L2 research. Research Methods in Applied Linguistics, 5(1), 100302 is available at https://doi.org/10.1016/j.rmal.2026.100302. | en_US
dc.subject | Automatic fluency assessment | en_US
dc.subject | Hybrid automated-manual pipeline | en_US
dc.subject | Second language speech | en_US
dc.subject | Temporal fluency features | en_US
dc.subject | Tool comparison | en_US
dc.title | Evaluating and enhancing the accuracy of automated fluency annotation tools in L2 research | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 5 | -
dc.identifier.issue | 1 | -
dc.identifier.doi | 10.1016/j.rmal.2026.100302 | -
dcterms.abstract | Fluency is a central dimension of L2 oral proficiency, and its assessment matters in many applied contexts, including pedagogy and testing. Yet measuring fluency through manual annotation is labor-intensive, which limits its scalability and broad application. We evaluate two automated tools, an acoustic-based tool (de Jong et al., 2021) and a machine-learning tool (Matsuura et al., 2025), using data from L1-Chinese learners of English. Accuracy was assessed for three metrics: articulation rate (AR), pause ratio (PR), and mean pause duration (MPD), via Pearson correlations with manual annotation. We also tested, using Steiger's test, whether targeted manual post-processing (TextGrid checks and transcript adjustments) improves metric extraction. In our sample, de Jong et al. (2021) yielded higher accuracy for silence-based metrics (PR, MPD), whereas text-dependent metrics (e.g., syllable counts for AR after removing disfluent words) benefited from corrected TextGrids (for the acoustic tool) or corrected transcripts (for the machine-learning tool). These findings suggest a scalable division of labor: use an acoustic-based tool for silence-driven metrics, and apply corrected transcripts with a machine-learning tool when extracting text-sensitive metrics. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Research methods in applied linguistics, Apr. 2026, v. 5, no. 1, 100302 | -
dcterms.isPartOf | Research methods in applied linguistics | -
dcterms.issued | 2026-04 | -
dc.identifier.scopus | 2-s2.0-105028960517 | -
dc.identifier.eissn | 2772-7661 | -
dc.identifier.artn | 100302 | -
dc.description.validate | 202603 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_TA | en_US
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.TA | Elsevier (2026) | en_US
dc.description.oaCategory | TA | en_US
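The abstract reports comparing each tool's Pearson correlation with manual annotation via Steiger's test, which contrasts two dependent correlations that share one variable (here, the manual scores). A minimal sketch of that comparison, using illustrative correlation values and sample size rather than the study's actual results:

```python
import math

def steiger_z(r_manual_a, r_manual_b, r_ab, n):
    """Steiger's (1980) Z-test for two dependent correlations sharing
    one variable: manual annotation vs. tool A, and vs. tool B."""
    za, zb = math.atanh(r_manual_a), math.atanh(r_manual_b)  # Fisher z
    rbar = (r_manual_a + r_manual_b) / 2
    # Covariance of the Fisher-transformed correlations (Steiger's
    # simplification of Dunn & Clark's formula, using the mean r).
    num = r_ab * (1 - 2 * rbar**2) - 0.5 * rbar**2 * (1 - 2 * rbar**2 - r_ab**2)
    cov = num / (1 - rbar**2) ** 2
    z = (za - zb) * math.sqrt((n - 3) / (2 - 2 * cov))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p-value
    return z, p

# Hypothetical numbers (not from the paper): tool A correlates .90 with
# manual PR, tool B .80, the tools correlate .85 with each other, n = 50.
z, p = steiger_z(0.90, 0.80, 0.85, 50)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The function name and all numeric inputs above are assumptions for illustration; the paper reports its own correlations and sample size.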
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
1-s2.0-S277276612600008X-main.pdf | | 1.63 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.