Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/112643
Title: Action progression networks for temporal action detection in videos
Authors: Lu, CK 
Mak, MW 
Li, RM
Chi, ZR 
Fu, H
Issue Date: 2024
Source: IEEE Access, 2024, v. 12, p. 126829-126844
Abstract: This study introduces an innovative Temporal Action Detection (TAD) model that is distinguished by its lightweight structure and capability for end-to-end training, delivering competitive performance. Traditional TAD approaches often rely on pre-trained models for feature extraction, compromising on end-to-end training for efficiency, yet encounter challenges due to misalignment with tasks and data shifts. Our method addresses these challenges by processing untrimmed videos on a snippet basis, facilitating a snippet-level TAD model that is trained end-to-end. Central to our approach is a novel frame-level label, termed action progressions, designed to encode temporal localization information. The prediction of action progressions not only enables our snippet-level model to incorporate temporal information effectively but also introduces a granular temporal encoding for the evolution of actions, enhancing the precision of detection. Beyond a streamlined pipeline, our model introduces several novel capabilities: 1) It directly learns from raw videos, unlike prevalent TAD methods that depend on frozen, pre-trained feature extraction models; 2) It is flexible for training with trimmed and untrimmed videos; 3) It is the first TAD model to avoid the detection of incomplete actions; and 4) It can accurately detect long-lasting actions or those with clear evolutionary patterns. Utilizing these advantages, our model achieves commendable performance on benchmark datasets, securing averaged mean Average Precision (mAP) scores of 54.8%, 30.5%, and 78.7% on THUMOS14, ActivityNet-1.3, and DFMAD, respectively.
Keywords: Action recognition
Temporal action detection
Video analysis
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE Access
EISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3451503
Rights: © 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
The following publication C.-K. Lu, M.-W. Mak, R. Li, Z. Chi and H. Fu, "Action Progression Networks for Temporal Action Detection in Videos," in IEEE Access, vol. 12, pp. 126829-126844, 2024 is available at https://dx.doi.org/10.1109/ACCESS.2024.3451503.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Lu_Action_Progression_Networks.pdf
Size: 1.93 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Web of Science citations: 1 (as of May 8, 2025)
