Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/112819
Title: Multi-level representation learning via ConvNeXt-based network for unaligned cross-view matching
Authors: Guan, F
Zhao, N
Fang, Z
Jiang, L
Zhang, J
Yu, Y 
Huang, H
Issue Date: 2025
Source: Geo-spatial information science (地球空间信息科学学报), Published online: 17 Jan 2025, Latest Articles, https://doi.org/10.1080/10095020.2024.2439385
Abstract: Cross-view matching refers to the use of images from different platforms (e.g. drone and satellite views) to retrieve the most relevant images, where the key challenges are the differences in viewpoint and spatial resolution. However, most existing methods focus on extracting fine-grained features and ignore the connections of contextual information in the image. Therefore, we propose a novel ConvNeXt-based multi-level representation learning model for this task. First, we extract global features through the ConvNeXt model. To obtain a joint part-based representation from the global features, we then replicate them, processing one copy with spatial attention and the other with a standard convolutional operation. In addition, the features of the different branches are aggregated through a multilevel feature fusion module to prepare for cross-view matching. Finally, we design a new hybrid loss function to better constrain these features and assist in mining crucial information from the global features. The experimental results indicate that we achieve advanced performance on two common datasets, University-1652 and SUES-200, reaching 89.79% and 95.75% in drone target matching and 94.87% and 98.80% in drone navigation.
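The following is a minimal, illustrative PyTorch sketch of the pipeline outlined in the abstract: a ConvNeXt backbone producing global features, a spatial-attention branch and a plain convolutional branch over replicated copies of those features, and a simple fusion step producing an embedding for cross-view retrieval. Module names such as SpatialAttentionBranch, CrossViewModel, the linear fusion layer, and the embedding dimension are assumptions for illustration only, not the authors' implementation; the hybrid loss is omitted.

# Minimal sketch (PyTorch): ConvNeXt backbone, two branches over replicated
# global features (spatial attention vs. standard convolution), then fusion.
# Names and dimensions here are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
from torchvision.models import convnext_tiny

class SpatialAttentionBranch(nn.Module):
    # Re-weights the replicated global feature map with a 1x1-conv spatial mask.
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
    def forward(self, x):
        return x * self.attn(x)

class CrossViewModel(nn.Module):
    # ConvNeXt features -> spatial-attention branch + conv branch -> fused embedding.
    def __init__(self, embed_dim=512, channels=768):  # 768 = convnext_tiny output width
        super().__init__()
        self.backbone = convnext_tiny(weights=None).features
        self.attn_branch = SpatialAttentionBranch(channels)
        self.conv_branch = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fusion = nn.Linear(channels * 2, embed_dim)  # simple stand-in for the fusion module
    def forward(self, x):
        feats = self.backbone(x)                        # shared global features
        a = self.pool(self.attn_branch(feats)).flatten(1)
        b = self.pool(self.conv_branch(feats)).flatten(1)
        return self.fusion(torch.cat([a, b], dim=1))    # joint multi-branch descriptor

# Usage: embed a drone image and a satellite image, compare by cosine similarity.
model = CrossViewModel().eval()
with torch.no_grad():
    drone = torch.randn(1, 3, 224, 224)
    satellite = torch.randn(1, 3, 224, 224)
    similarity = torch.cosine_similarity(model(drone), model(satellite))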
Keywords: ConvNeXt
Cross-view matching
Drone view
Multilevel feature
Satellite view
Publisher: Taylor & Francis Asia Pacific (Singapore)
Journal: Geo-spatial information science (地球空间信息科学学报) 
ISSN: 1009-5020
EISSN: 1993-5153
DOI: 10.1080/10095020.2024.2439385
Rights: © 2025 Wuhan University. Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.
The following publication Guan, F., Zhao, N., Fang, Z., Jiang, L., Zhang, J., Yu, Y., & Huang, H. (2025). Multi-level representation learning via ConvNeXt-based network for unaligned cross-view matching. Geo-Spatial Information Science, 1–14 is available at https://doi.org/10.1080/10095020.2024.2439385.
Appears in Collections: Journal/Magazine Article

Files in This Item:
Guan_Multi-level_Representation_Learning.pdf (18.48 MB, Adobe PDF)
Open Access Information
Status: Open Access
File Version: Version of Record

SCOPUS™ Citations: 5 (as of Dec 19, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.