Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107968
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | en_US
dc.creator | Senior, H | en_US
dc.creator | Slabaugh, G | en_US
dc.creator | Yuan, S | en_US
dc.creator | Rossi, L | en_US
dc.date.accessioned | 2024-07-22T02:44:41Z | -
dc.date.available | 2024-07-22T02:44:41Z | -
dc.identifier.issn | 0178-2789 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/107968 | -
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.rights | © The Author(s) 2024 | en_US
dc.rights | This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. | en_US
dc.rights | The following publication Senior, H., Slabaugh, G., Yuan, S. et al. Graph neural networks in vision-language image understanding: a survey. Vis Comput 41, 491–516 (2025) is available at https://doi.org/10.1007/s00371-024-03343-0. | en_US
dc.subject | Graph neural networks | en_US
dc.subject | Image captioning | en_US
dc.subject | Image retrieval | en_US
dc.subject | Visual question answering | en_US
dc.title | Graph neural networks in vision-language image understanding : a survey | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 491 | en_US
dc.identifier.epage | 516 | en_US
dc.identifier.volume | 41 | en_US
dc.identifier.issue | 1 | en_US
dc.identifier.doi | 10.1007/s00371-024-03343-0 | en_US
dcterms.abstract | 2D image understanding is a complex problem within computer vision, but it holds the key to providing human-level scene comprehension. It goes further than identifying the objects in an image, and instead, it attempts to understand the scene. Solutions to this problem form the underpinning of a range of tasks, including image captioning, visual question answering (VQA), and image retrieval. Graphs provide a natural way to represent the relational arrangement between objects in an image, and thus, in recent years graph neural networks (GNNs) have become a standard component of many 2D image understanding pipelines, becoming a core architectural component, especially in the VQA group of tasks. In this survey, we review this rapidly evolving field and we provide a taxonomy of graph types used in 2D image understanding approaches, a comprehensive list of the GNN models used in this domain, and a roadmap of future potential developments. To the best of our knowledge, this is the first comprehensive survey that covers image captioning, visual question answering, and image retrieval techniques that focus on using GNNs as the main part of their architecture. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Visual computer, Jan. 2025, v. 41, no. 1, p. 491-516 | en_US
dcterms.isPartOf | Visual computer | en_US
dcterms.issued | 2025-01 | -
dc.identifier.scopus | 2-s2.0-85188956287 | -
dc.identifier.eissn | 1432-2315 | en_US
dc.description.validate | 202407 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | a3057 | -
dc.identifier.SubFormID | 49302 | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
s00371-024-03343-0.pdf | | 2.94 MB | Adobe PDF
Open Access Information
Status | open access
File Version | Version of Record

Page views: 50 (as of Apr 14, 2025)
Downloads: 7 (as of Apr 14, 2025)
Scopus citations: 3 (as of Mar 27, 2025)
Web of Science citations: 3 (as of Mar 27, 2025)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.