Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106861
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Lu, C | en_US
dc.creator | Mak, MW | en_US
dc.date.accessioned | 2024-06-06T06:06:01Z | -
dc.date.available | 2024-06-06T06:06:01Z | -
dc.identifier.issn | 0925-2312 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/106861 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier BV | en_US
dc.subject | Action recognition | en_US
dc.subject | Intelligent video system | en_US
dc.subject | Temporal action detection | en_US
dc.title | DITA : DETR with improved queries for end-to-end temporal action detection | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.doi | 10.1016/j.neucom.2024.127914 | en_US
dcterms.abstract | The DEtection TRansformer (DETR), with its elegant architecture and set prediction, has revolutionized object detection. However, DETR-like models have yet to achieve comparable success in temporal action detection (TAD). To address this gap, we introduce a series of improvements to the original DETR, proposing a new DETR-based model for TAD that achieves competitive performance relative to conventional TAD methods. Specifically, we adapt advanced techniques from DETR variants used in object detection, including deformable attention, denoising training, and selective query recollection. Furthermore, we propose several new techniques aimed at enhancing detection precision and model convergence speed, such as geographic query grouping and learnable proposals. Leveraging these innovations, we introduce a new model called DETR with Improved queries for Temporal Action Detection (DITA). DITA not only adheres to DETR's elegant design philosophy but is also competitive with state-of-the-art action detection models. Remarkably, it is the first TAD model to achieve an mAP over 70% on THUMOS14, outperforming the previous best DETR variant by 13.5 percentage points. | -
dcterms.accessRights | embargoed access | en_US
dcterms.bibliographicCitation | Neurocomputing. Available online 28 May 2024, In Press, Journal Pre-proof, 127914, https://doi.org/10.1016/j.neucom.2024.127914 | en_US
dcterms.isPartOf | Neurocomputing | en_US
dcterms.issued | 2024 | -
dc.identifier.eissn | 1872-8286 | en_US
dc.identifier.artn | 127914 | en_US
dc.description.validate | 202406 bcch | -
dc.identifier.FolderNumber | a2778 | -
dc.identifier.SubFormID | 48310 | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Early release | en_US
dc.date.embargo | 0000-00-00 (to be updated) | en_US
dc.description.oaCategory | Green (AAM) | en_US
Appears in Collections:Journal/Magazine Article
Open Access Information
Status: embargoed access
Embargo End Date: 0000-00-00 (to be updated)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.