Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/105180
Title: Towards interactive information seeking : conversational question answering
Authors: Li, Yongqi
Degree: Ph.D.
Issue Date: 2024
Abstract:
In recent years, the rise of machine learning techniques has accelerated the development of conversational agents. These agents provide a natural and convenient way for people to chit-chat, complete well-specified tasks, and seek information in their daily lives. People often prefer to ask conversational questions when they have complex information needs or are interested in certain broad topics. Although current personal assistant systems can complete tasks and even conduct small talk, they cannot handle information-seeking conversations whose complicated information needs require multiple turns of interaction. It is therefore essential to endow conversational agents with the capability of answering conversational questions, which opens up a broad new research area, namely conversational question answering (CoQA). For all its advantages, however, CoQA has proven significantly more challenging than traditional QA. Specifically, we identify three main research problems to be addressed in conversational QA:
1. How to explore the dynamic interaction between information content and communication intent for thorough conversation context modeling?
2. How to accurately identify users' information needs within conversations and retrieve relevant documents from the web accordingly?
3. How to enrich textual conversational QA with multimodal evidence, such as images and tables, and effectively model the complex relations among multimodal items?
To address these problems, we focus on developing effective conversational QA systems from three aspects: multiple conversation flow tracking, generative document retrieval, and multimodal knowledge enhancement. Accordingly, the research work in this thesis is organized into three parts.
In the first part (works 1 and 2), we investigate research problem 1, the core research problem of conversational QA. We model the intricate interactions within conversations as multiple conversation flows to enhance question answering. In work 1, we leverage the conversation context to help locate answers for the current turn. Owing to the coherent nature of conversational questions, the corresponding answers tend to be related and organized within logically connected passages on the web. We term this coherence among answers the answer flow and exploit it to improve current-turn QA. In work 2, we take the concept of flow in conversational QA a step further and introduce three distinct conversation flows: question flow, topic flow, and answer flow. These flows enhance the current-turn QA process by capturing multi-level information transitions.
The second part (works 3, 4, and 5) explores solutions to problem 2, aiming at more effective passage retrieval in conversational QA. In this part, we explore a new retrieval paradigm, generative retrieval, and demonstrate its effectiveness in the conversational setting. In work 3, we propose a generative retrieval framework enhanced with multiview identifiers, which unifies previous generative retrieval methods in one framework and achieves state-of-the-art performance.
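As a purely illustrative aside, the sketch below shows the generative retrieval paradigm in its simplest form: a seq2seq model decodes a document identifier under a prefix trie that constrains generation to valid identifier strings. This is not the thesis's actual implementation; the `t5-base` checkpoint stands in for a fine-tuned retriever, and the two-document corpus is a placeholder. Under the multiview idea, each document would contribute several identifier strings (title, salient substrings, pseudo-queries) to the same trie.

```python
# Minimal sketch of generative retrieval: decode a document identifier with
# beam search constrained to a trie of valid identifiers. Assumes a
# fine-tuned seq2seq retriever; "t5-base" is only a placeholder checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# Toy corpus: identifier string -> document id. With multiview identifiers,
# each document would contribute several strings, all mapping back to it.
identifiers = {
    "Conversational Question Answering": "doc1",
    "Generative Retrieval for Passage Search": "doc2",
}

# Build a token-level prefix trie over all identifier strings.
trie = {}
for ident in identifiers:
    token_ids = tokenizer(ident, add_special_tokens=False).input_ids
    node = trie
    for tok in token_ids + [tokenizer.eos_token_id]:
        node = node.setdefault(tok, {})

def allowed_tokens(batch_id, prefix_ids):
    # Walk the trie along the tokens decoded so far (skipping the decoder
    # start token) and allow only the children of the current node.
    node = trie
    for tok in prefix_ids.tolist()[1:]:
        node = node.get(tok, {})
    return list(node.keys()) or [tokenizer.eos_token_id]

query = "how can documents be retrieved by generating their identifiers"
inputs = tokenizer(query, return_tensors="pt")
out = model.generate(
    **inputs,
    num_beams=4,
    max_new_tokens=32,
    prefix_allowed_tokens_fn=allowed_tokens,
)
pred = tokenizer.decode(out[0], skip_special_tokens=True)
print(pred, "->", identifiers.get(pred, "unknown"))
```

The constraint is what makes the paradigm workable: the model can only emit strings that correspond to real documents, so retrieval reduces to beam search over the trie rather than a lookup in a dense index.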
In work 4, we further enhance this framework by bridging it with the classical learning-to-rank paradigm. In work 5, we apply the proposed generative retrieval method to the conversational QA task and demonstrate the clear superiority of generative retrieval in this setting.
The third part (work 6) addresses problem 3, the multimodal conversational QA problem. Previous conversational QA systems usually rely on a single knowledge source, overlooking visual evidence and other multimodal knowledge sources. In work 6, we therefore define a novel research task, multimodal conversational question answering (MMCoQA), which aims to answer users' questions from multimodal knowledge sources via multi-turn conversations. This new task raises a series of research challenges, including but not limited to the priority, consistency, and complementarity of multimodal knowledge. To facilitate data-driven approaches in this area, we construct the first multimodal conversational QA dataset, named MMConvQA, and introduce a multimodal conversational QA model; a schematic example layout is sketched after this abstract.
In summary, we study the problem of conversational QA in a systematic way. We demonstrate the effectiveness of the proposed approaches on real-world datasets, which suggests the potential of our work in real-world applications.
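To make the MMCoQA setting described above concrete, the sketch below shows one hypothetical way a single multimodal conversational QA example could be laid out. Every field name here is an assumption for illustration, not the actual MMConvQA schema.

```python
# Illustrative only: a hypothetical layout for one multimodal conversational
# QA example. All field names are assumptions, not the real MMConvQA schema.
from dataclasses import dataclass
from typing import List, Literal

@dataclass
class Evidence:
    modality: Literal["text", "image", "table"]  # type of knowledge item
    content: str  # passage text, image path, or a serialized table

@dataclass
class MMCoQAExample:
    history: List[str]        # alternating prior questions and answers
    question: str             # current-turn question
    evidence: List[Evidence]  # candidate multimodal knowledge items
    answer: str               # gold answer for the current turn

# Here the current turn can only be answered from the image item,
# illustrating the "priority" challenge: the system must decide which
# modality the question actually calls for.
example = MMCoQAExample(
    history=["Who directed Inception?", "Christopher Nolan"],
    question="What does its poster show?",
    evidence=[
        Evidence("text", "Inception is a 2010 science-fiction film ..."),
        Evidence("image", "posters/inception.jpg"),
        Evidence("table", "Year | Film | Director\n2010 | Inception | Nolan"),
    ],
    answer="A man in a suit standing in a folding cityscape.",
)
```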
Subjects: Question-answering systems
Intelligent agents (Computer software)
Natural language generation (Computer science)
Hong Kong Polytechnic University -- Dissertations
Pages: xvi, 185 pages : color illustrations
Appears in Collections: Thesis
Access: View full-text via https://theses.lib.polyu.edu.hk/handle/200/12892