Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/105525
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | Huang, Q | -
dc.creator | Wei, J | -
dc.creator | Cai, Y | -
dc.creator | Zheng, C | -
dc.creator | Chen, J | -
dc.creator | Leung, HF | -
dc.creator | Li, Q | -
dc.date.accessioned | 2024-04-15T07:34:51Z | -
dc.date.available | 2024-04-15T07:34:51Z | -
dc.identifier.isbn | 978-1-952148-25-5 | -
dc.identifier.uri | http://hdl.handle.net/10397/105525 | -
dc.description | 58th Annual Meeting of the Association for Computational Linguistics, Online, July 5th-10th, 2020 | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computational Linguistics (ACL) | en_US
dc.rights | © 2020 Association for Computational Linguistics | en_US
dc.rights | This publication is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). | en_US
dc.rights | The following publication Qingbao Huang, Jielong Wei, Yi Cai, Changmeng Zheng, Junying Chen, Ho-fung Leung, and Qing Li. 2020. Aligned Dual Channel Graph Convolutional Network for Visual Question Answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7166–7176, Online. Association for Computational Linguistics is available at https://doi.org/10.18653/v1/2020.acl-main.642. | en_US
dc.title | Aligned dual channel graph convolutional network for visual question answering | en_US
dc.type | Conference Paper | en_US
dc.identifier.spage | 7166 | -
dc.identifier.epage | 7176 | -
dc.identifier.doi | 10.18653/v1/2020.acl-main.642 | -
dcterms.abstract | Visual question answering aims to answer a natural-language question about a given image. Existing graph-based methods focus only on the relations between objects in an image and neglect the syntactic dependency relations between words in a question. To capture both kinds of relations simultaneously, we propose a novel dual channel graph convolutional network (DC-GCN) that better combines visual and textual information. The DC-GCN model consists of three parts: an I-GCN module to capture the relations between objects in an image, a Q-GCN module to capture the syntactic dependency relations between words in a question, and an attention alignment module to align the image and question representations (see the sketch after this record). Experimental results show that our model achieves performance comparable to state-of-the-art approaches. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, p. 7166-7176. Stroudsburg, PA, USA: Association for Computational Linguistics (ACL), 2020 | -
dcterms.issued | 2020 | -
dc.relation.ispartofbook | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics | -
dc.relation.conference | Annual Meeting of the Association for Computational Linguistics [ACL] | -
dc.description.validate | 202402 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | COMP-0282 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | Internal research grant from PolyU | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 49984830 | en_US
dc.description.oaCategory | CC | en_US
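
The abstract above describes the DC-GCN as three parts: an I-GCN channel over image objects, a Q-GCN channel over the question's dependency relations, and an attention alignment module between the two. As a rough, hypothetical sketch of that idea (not the authors' implementation; the module names, tensor shapes, single-layer channels, and fusion step here are all illustrative assumptions), the two channels and the alignment step might be wired up in PyTorch as follows:

    # Hypothetical sketch of a dual-channel GCN with attention alignment.
    # Names, shapes, and the single-layer design are assumptions for
    # illustration, not the paper's implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GCNLayer(nn.Module):
        """One graph-convolution step: aggregate neighbor features via a
        (normalized) adjacency matrix, then apply a linear projection."""
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.proj = nn.Linear(in_dim, out_dim)

        def forward(self, x, adj):
            # x: (batch, nodes, in_dim); adj: (batch, nodes, nodes)
            return F.relu(self.proj(torch.bmm(adj, x)))

    class DualChannelGCN(nn.Module):
        """Two parallel GCN channels (image objects / question words)
        whose outputs are aligned by cross-attention."""
        def __init__(self, img_dim, txt_dim, hid_dim):
            super().__init__()
            self.i_gcn = GCNLayer(img_dim, hid_dim)  # object relations
            self.q_gcn = GCNLayer(txt_dim, hid_dim)  # dependency relations

        def forward(self, obj_feats, obj_adj, word_feats, dep_adj):
            v = self.i_gcn(obj_feats, obj_adj)    # (batch, n_obj, hid)
            q = self.q_gcn(word_feats, dep_adj)   # (batch, n_word, hid)
            # Attention alignment: each word attends over object nodes.
            attn = torch.softmax(torch.bmm(q, v.transpose(1, 2)), dim=-1)
            v_aligned = torch.bmm(attn, v)        # (batch, n_word, hid)
            # Fuse aligned visual features with the question channel,
            # then pool to a single vector for a downstream classifier.
            fused = v_aligned * q
            return fused.mean(dim=1)              # (batch, hid)

Under these assumptions, such a model would consume object features with an object-relation adjacency matrix and word embeddings with a dependency-tree adjacency matrix, e.g. DualChannelGCN(img_dim=2048, txt_dim=300, hid_dim=512).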
Appears in Collections: Conference Paper
Files in This Item:
File | Description | Size | Format
2020.acl-main.642.pdf | - | 3.81 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Access: View full-text via PolyU eLinks (SFX Query)

Page views: 98 (last week: 3), as of Nov 30, 2025
Downloads: 24, as of Nov 30, 2025

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.