Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/112586
DC Field | Value | Language |
---|---|---|
dc.contributor | Department of Computing | en_US |
dc.creator | Li, Y | en_US |
dc.creator | Li, W | en_US |
dc.creator | Nie, L | en_US |
dc.date.accessioned | 2025-04-17T06:34:43Z | - |
dc.date.available | 2025-04-17T06:34:43Z | - |
dc.identifier.uri | http://hdl.handle.net/10397/112586 | - |
dc.description | 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, May 22-27, 2022 | en_US |
dc.language.iso | en | en_US |
dc.publisher | Association for Computational Linguistics | en_US |
dc.rights | ©2022 Association for Computational Linguistics | en_US |
dc.rights | Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. (https://creativecommons.org/licenses/by/4.0/) | en_US |
dc.rights | The following publication Li, Y., Li, W., & Nie, L. (2022, May). MMCoQA: Conversational Question Answering over Text, Tables, and Images. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 4220-4231). Dublin, Ireland. It is available at https://doi.org/10.18653/v1/2022.acl-long.290. | en_US |
dc.title | MMCoQA: conversational question answering over text, tables, and images | en_US |
dc.type | Conference Paper | en_US |
dc.identifier.spage | 4220 | en_US |
dc.identifier.epage | 4231 | en_US |
dc.identifier.volume | 1 | en_US |
dc.identifier.doi | 10.18653/v1/2022.acl-long.290 | en_US |
dcterms.abstract | The rapid development of conversational assistants accelerates the study of conversational question answering (QA). However, existing conversational QA systems usually answer users’ questions with a single knowledge source, e.g., paragraphs or a knowledge graph, and overlook important visual cues, let alone multiple knowledge sources of different modalities. In this paper, we therefore define a novel research task, multimodal conversational question answering (MMCoQA), which aims to answer users’ questions with multimodal knowledge sources via multi-turn conversations. This new task brings a series of research challenges, including but not limited to the priority, consistency, and complementarity of multimodal knowledge. To facilitate data-driven approaches in this area, we construct the first multimodal conversational QA dataset, named MMConvQA. Questions are fully annotated not only with natural-language answers but also with the corresponding evidence and valuable decontextualized, self-contained questions. Meanwhile, we introduce an end-to-end baseline model, which divides this complex research task into question understanding, multimodal evidence retrieval, and answer extraction. Moreover, we report a set of benchmark results, which indicate that there is ample room for improvement. | en_US |
dcterms.accessRights | open access | en_US |
dcterms.bibliographicCitation | In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 4220-4231. Stroudsburg, PA: Association for Computational Linguistics (ACL), 2022 | en_US |
dcterms.issued | 2022 | - |
dc.identifier.scopus | 2-s2.0-85133856550 | - |
dc.relation.ispartofbook | Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) | en_US |
dc.relation.conference | Association for Computational Linguistics [ACL] | en_US |
dc.description.validate | 202504 bcch | en_US |
dc.description.oa | Version of Record | en_US |
dc.identifier.FolderNumber | OA_Others | - |
dc.description.fundingSource | RGC | en_US |
dc.description.fundingSource | Others | en_US |
dc.description.fundingText | National Natural Science Foundation of China (62076212); PolyU internal grants (ZVQ0) | en_US |
dc.description.pubStatus | Published | en_US |
dc.description.oaCategory | CC | en_US |
Appears in Collections: | Conference Paper |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2022.acl-long.290.pdf | | 5.93 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.