Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/112643
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Lu, CK | -
dc.creator | Mak, MW | -
dc.creator | Li, RM | -
dc.creator | Chi, ZR | -
dc.creator | Fu, H | -
dc.date.accessioned | 2025-04-24T00:28:17Z | -
dc.date.available | 2025-04-24T00:28:17Z | -
dc.identifier.uri | http://hdl.handle.net/10397/112643 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. | en_US
dc.rights | For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_US
dc.rights | The following publication C.-K. Lu, M.-W. Mak, R. Li, Z. Chi and H. Fu, "Action Progression Networks for Temporal Action Detection in Videos," in IEEE Access, vol. 12, pp. 126829-126844, 2024 is available at https://dx.doi.org/10.1109/ACCESS.2024.3451503. | en_US
dc.subject | Action recognition | en_US
dc.subject | Temporal action detection | en_US
dc.subject | Video analysis | en_US
dc.title | Action progression networks for temporal action detection in videos | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 126829 | -
dc.identifier.epage | 126844 | -
dc.identifier.volume | 12 | -
dc.identifier.doi | 10.1109/ACCESS.2024.3451503 | -
dcterms.abstract | This study introduces an innovative Temporal Action Detection (TAD) model that is distinguished by its lightweight structure and capability for end-to-end training, delivering competitive performance. Traditional TAD approaches often rely on pre-trained models for feature extraction, compromising on end-to-end training for efficiency, yet encounter challenges due to misalignment with tasks and data shifts. Our method addresses these challenges by processing untrimmed videos on a snippet basis, facilitating a snippet-level TAD model that is trained end-to-end. Central to our approach is a novel frame-level label, termed action progressions, designed to encode temporal localization information. The prediction of action progressions not only enables our snippet-level model to incorporate temporal information effectively but also introduces a granular temporal encoding for the evolution of actions, enhancing the precision of detection. Beyond a streamlined pipeline, our model introduces several novel capabilities: 1) It directly learns from raw videos, unlike prevalent TAD methods that depend on frozen, pre-trained feature extraction models; 2) It is flexible for training with trimmed and untrimmed videos; 3) It is the first TAD model to avoid the detection of incomplete actions; and 4) It can accurately detect long-lasting actions or those with clear evolutionary patterns. Utilizing these advantages, our model achieves commendable performance on benchmark datasets, securing averaged mean Average Precision (mAP) scores of 54.8%, 30.5%, and 78.7% on THUMOS14, ActivityNet-1.3, and DFMAD, respectively. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE access, 2024, v. 12, p. 126829-126844 | -
dcterms.isPartOf | IEEE access | -
dcterms.issued | 2024 | -
dc.identifier.isi | WOS:001316135100001 | -
dc.identifier.eissn | 2169-3536 | -
dc.description.validate | 202504 bcrc | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_Scopus/WOS | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | Hong Kong Polytechnic University | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
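The abstract above describes "action progressions" as frame-level labels that encode how far an action has advanced at each frame of an untrimmed video. As a rough illustration only, the sketch below assumes a progression label defined as a frame's normalized position within an annotated action segment; the function name, the background convention, and this exact definition are assumptions made for illustration and are not taken from the paper itself.

import numpy as np

def action_progression_labels(num_frames, segments):
    """Hypothetical sketch: per-frame class and progression labels.

    segments: list of (start_frame, end_frame, class_id) tuples for the
    annotated actions in an untrimmed video. Frames outside every segment
    are treated as background (class -1, progression 0).
    """
    classes = np.full(num_frames, -1, dtype=np.int64)
    progressions = np.zeros(num_frames, dtype=np.float32)
    for start, end, cls in segments:
        length = max(end - start, 1)
        for t in range(start, min(end, num_frames)):
            classes[t] = cls
            # Relative position of frame t within this action instance,
            # so the label grows from ~0 at the onset to ~1 near the end.
            progressions[t] = (t - start) / length
    return classes, progressions

# Example: a 100-frame clip with one action (class 3) spanning frames 20-60.
cls, prog = action_progression_labels(100, [(20, 60, 3)])
print(cls[19], cls[20], cls[59])   # -1 (background), 3, 3
print(prog[20], prog[59])          # 0.0 at the action onset, 0.975 just before it ends

Under this kind of labelling, a snippet-level model regressing the progression value would receive a dense temporal target, which is one plausible way the abstract's "granular temporal encoding for the evolution of actions" could be realized.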
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Lu_Action_Progression_Networks.pdf | - | 1.93 MB | Adobe PDF (View/Open)
Open Access Information
Status: open access
File Version: Version of Record
Access: View full-text via PolyU eLinks SFX Query

Web of Science™ citations: 1 (as of May 8, 2025)