Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/112643
Title: | Action progression networks for temporal action detection in videos | Authors: | Lu, CK Mak, MW Li, RM Chi, ZR Fu, H |
Issue Date: | 2024 | Source: | IEEE access, 2024, v. 12, p. 126829-126844 | Abstract: | This study introduces an innovative Temporal Action Detection (TAD) model that is distinguished by its lightweight structure and capability for end-to-end training, delivering competitive performance. Traditional TAD approaches often rely on pre-trained models for feature extraction, compromising on end-to-end training for efficiency, yet encounter challenges due to misalignment with tasks and data shifts. Our method addresses these challenges by processing untrimmed videos on a snippet basis, facilitating a snippet-level TAD model that is trained end-to-end. Central to our approach is a novel frame-level label, termed action progressions, designed to encode temporal localization information. The prediction of action progressions not only enables our snippet-level model to incorporate temporal information effectively but also introduces a granular temporal encoding for the evolution of actions, enhancing the precision of detection. Beyond a streamlined pipeline, our model introduces several novel capabilities: 1) It directly learns from raw videos, unlike prevalent TAD methods that depend on frozen, pre-trained feature extraction models; 2) It is flexible for training with trimmed and untrimmed videos; 3) It is the first TAD model to avoid the detection of incomplete actions; and 4) It can accurately detect long-lasting actions or those with clear evolutionary patterns. Utilizing these advantages, our model achieves commendable performance on benchmark datasets, securing averaged mean Average Precision (mAP) scores of 54.8%, 30.5%, and 78.7% on THUMOS14, ActivityNet-1.3, and DFMAD, respectively. | Keywords: | Action recognition Temporal action detection Video analysis |
Publisher: | Institute of Electrical and Electronics Engineers | Journal: | IEEE access | EISSN: | 2169-3536 | DOI: | 10.1109/ACCESS.2024.3451503 | Rights: | © 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ The following publication C. -K. Lu, M. -W. Mak, R. Li, Z. Chi and H. Fu, "Action Progression Networks for Temporal Action Detection in Videos," in IEEE Access, vol. 12, pp. 126829-126844, 2024 is available at https://dx.doi.org/10.1109/ACCESS.2024.3451503. |
Appears in Collections: | Journal/Magazine Article |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Lu_Action_Progression_Networks.pdf | 1.93 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.