Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106607
DC Field | Value | Language
dc.contributor | Department of Mechanical Engineering | en_US
dc.creator | Zhang, S | en_US
dc.creator | Zhang, D | en_US
dc.creator | Zou, Q | en_US
dc.date.accessioned | 2024-05-14T05:42:05Z | -
dc.date.available | 2024-05-14T05:42:05Z | -
dc.identifier.issn | 1380-7501 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/106607 | -
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.rights | © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024 | en_US
dc.rights | This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s11042-024-19002-4. | en_US
dc.subject | Visual object tracking | en_US
dc.subject | Global-local representation aggregation | en_US
dc.subject | Channel information | en_US
dc.subject | Transformer attention | en_US
dc.subject | Convolution | en_US
dc.title | TGLC: visual object tracking by fusion of global-local information and channel information | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 89151 | en_US
dc.identifier.epage | 89172 | en_US
dc.identifier.volume | 83 | en_US
dc.identifier.issue | 41 | en_US
dc.identifier.doi | 10.1007/s11042-024-19002-4 | en_US
dcterms.abstract | Visual object tracking aims to locate a target in every frame of a video given only its initial location, an important yet demanding task in computer vision. Recent approaches fuse global information from the template and the search region and achieve promising tracking performance. However, global fusion discards some local details, and local information is essential for distinguishing the target from background regions. To address this problem, this work presents a novel tracking algorithm, TGLC, which integrates a channel-aware convolution block and Transformer attention to aggregate global and local representations and to model channel information, enabling accurate estimation of the target bounding box. Extensive experiments are conducted on five widely recognized datasets, i.e., GOT-10k, TrackingNet, LaSOT, OTB100 and UAV123. The results show that the proposed tracker achieves competitive performance compared with state-of-the-art trackers while still running in real time. Visualizations of the tracking results on LaSOT further demonstrate the method's ability to cope with tracking challenges such as illumination variation, target deformation and background clutter. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Multimedia tools and applications, Dec. 2024, v. 83, no. 41, p. 89151-89172 | en_US
dcterms.isPartOf | Multimedia tools and applications | en_US
dcterms.issued | 2024-12 | -
dc.description.validate | 202405 bcrc | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | a2698 | -
dc.identifier.SubFormID | 48069 | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
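
Illustrative note: the abstract above describes TGLC only at a high level (a Transformer-attention branch for global context, a convolution branch for local detail, and channel information modeling). The PyTorch sketch below shows one plausible way such a global-local-channel fusion block could be wired up. It is an assumption for illustration only; the class and parameter names (GlobalLocalChannelFusion, dim, reduction, etc.) are hypothetical and this is not the authors' implementation.

# Hedged sketch of global-local fusion with channel gating; not the TGLC code.
import torch
import torch.nn as nn


class GlobalLocalChannelFusion(nn.Module):
    """Fuses global context (Transformer attention), local detail (depth-wise
    convolution) and channel statistics (squeeze-and-excitation style gate)."""

    def __init__(self, dim: int = 256, num_heads: int = 8, reduction: int = 4):
        super().__init__()
        # Global branch: multi-head self-attention over flattened spatial tokens.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # Local branch: depth-wise 3x3 convolution preserves spatial detail.
        self.local_conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        # Channel branch: global average pooling -> bottleneck MLP -> sigmoid gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map, e.g. combined template/search features.
        b, c, h, w = x.shape
        # Global: self-attention over the (H*W) token sequence.
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        g, _ = self.attn(tokens, tokens, tokens)
        g = self.norm(tokens + g).transpose(1, 2).reshape(b, c, h, w)
        # Local: convolution keeps fine-grained structure.
        l = self.local_conv(x)
        # Channel: re-weight channels of the aggregated map.
        fused = g + l
        return fused * self.channel_gate(fused)


if __name__ == "__main__":
    feat = torch.randn(2, 256, 16, 16)   # toy feature map
    out = GlobalLocalChannelFusion()(feat)
    print(out.shape)                     # torch.Size([2, 256, 16, 16])

The sketch simply sums the attention and convolution outputs and then re-weights channels with a squeeze-and-excitation style gate; the paper may combine the branches differently.
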
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Size | Format
Zhang_TGLC_Visual_Object.pdf | 1.74 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Page views: 47 (as of Apr 14, 2025)
Downloads: 5 (as of Apr 14, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.