Deep motion-appearance convolutions for robust visual tracking

Li, HJ; Wu, SH; Huang, SP; Lam, KM; Xing, XF

doi:10.1109/ACCESS.2019.2958405

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/82194

DC Field	Value	Language
dc.contributor	Department of Electronic and Information Engineering	-
dc.creator	Li, HJ	-
dc.creator	Wu, SH	-
dc.creator	Huang, SP	-
dc.creator	Lam, KM	-
dc.creator	Xing, XF	-
dc.date.accessioned	2020-05-05T05:59:03Z	-
dc.date.available	2020-05-05T05:59:03Z	-
dc.identifier.uri	http://hdl.handle.net/10397/82194	-
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers	en_US
dc.rights	This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/	en_US
dc.rights	The following publication H. Li, S. Wu, S. Huang, K. Lam and X. Xing, "Deep Motion-Appearance Convolutions for Robust Visual Tracking," in IEEE Access, vol. 7, pp. 180451-180466, 2019 is available at https://dx.doi.org/10.1109/ACCESS.2019.2958405	en_US
dc.subject	Visual tracking	en_US
dc.subject	3D convolutional kernels	en_US
dc.subject	Motion-appearance	en_US
dc.title	Deep motion-appearance convolutions for robust visual tracking	en_US
dc.type	Journal/Magazine Article	en_US
dc.identifier.spage	180451	-
dc.identifier.epage	180466	-
dc.identifier.volume	7	-
dc.identifier.doi	10.1109/ACCESS.2019.2958405	-
dcterms.abstract	Visual tracking is a challenging task due to unconstrained appearance variations and dynamic surrounding backgrounds, which basically arise from the complex motion of the target object. Therefore, the information and the correlation between the target motion and its resulting appearance should be considered comprehensively to achieve robust tracking performance. In this paper, we propose a deep neural network for visual tracking, namely the Motion-Appearance Dual (MADual) network, which employs a dual-branch architecture, by using deep two-dimensional (2D) and deep three-dimensional (3D) convolutions to integrate the local and global information of the target object's motion and appearance synchronously. For each frame of a tracking video, 2D convolutional kernels of the deep 2D branch slide over the frame to extract its global spatial-appearance features. Meanwhile, 3D convolutional kernels of the deep 3D branch are used to collaboratively extract the appearance and the associated motion features of the visual target from successive frames. By sliding the 3D convolutional kernels along a video sequence, the model is able to learn the temporal features from previous frames, and therefore, generate the local patch-based motion patterns of the target. Sliding the 2D kernels on a frame and the 3D kernels on a frame cube synchronously enables a better hierarchical motion-appearance integration, and boosts the performance for the visual tracking task. To further improve the tracking precision, an extra ridge-regression model is trained for the tracking process, based not only on the bounding box given in the first frame, but also on its synchro-frame-cube using our proposed Inverse Temporal Training method (ITT). Extensive experiments on popular benchmark datasets, OTB2013, OTB50, OTB2015, UAV123, TC128, VOT2015 and VOT2016, demonstrate that the proposed MADual tracker performs favorably against many state-of-the-art methods.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	IEEE Access, 9 Dec. 2019, v. 7, p. 180451-180466	-
dcterms.isPartOf	IEEE Access	-
dcterms.issued	2019	-
dc.identifier.isi	WOS:000509483800250	-
dc.identifier.scopus	2-s2.0-85077215831	-
dc.identifier.eissn	2169-3536	-
dc.description.validate	202006 bcrc	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	OA_Scopus/WOS	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	CC	en_US
Appears in Collections:	Journal/Magazine Article

Files in This Item:

File	Description	Size	Format
Li_Motion-Appearance_Convolutions_Visual.pdf		2.14 MB	Adobe PDF	View/Open

Open Access Information

Status	open access
File Version	Version of Record

Metric	Count	As of
Page views	208 (47 in the last week)	Feb 9, 2026
Downloads	186	Feb 9, 2026
Scopus citations	2	May 8, 2026
Web of Science citations	2	Apr 23, 2026
