Title: Understanding human comprehension and attention in reading
Authors: Li, Jiajia
Advisors: Ngai, Grace (COMP)
Chan, C. F. Stephen (COMP)
Keywords: Reading -- Physiological aspects
Attention -- Physiological aspects
Eye -- Movements
Human-computer interaction
Issue Date: 2017
Publisher: The Hong Kong Polytechnic University
Abstract: Reading is one of the most common computer interaction activities and one of the most fundamental means of knowledge acquisition. With the development of computing technologies and the growing popularity of e-Learning platforms, understanding human attention and comprehension through reading behaviors has the potential to become an important means of enhancing the learning experience and its effectiveness. Eye gaze patterns are known to play an important role in the study of reading behaviors, since reading can be considered a task in which visual processing and sensorimotor control take place in a highly structured visual environment [79]. Many studies have shown that eye movements and eye behavior during reading are closely related to human cognitive states such as comprehension and attention [81][88][89]. There are two main drawbacks in current state-of-the-art research on comprehension and attention detection based on eye gaze patterns. First, many studies use expensive and intrusive devices, such as electrooculography systems, to track eye movement, or rely on electroencephalography (EEG) devices to obtain ground truth for the user's mental state. Second, numerous methods study how lexical and linguistic variables affect eye gaze behavior during reading; these methods therefore rely on the availability of a linguistic analysis of the reading materials. To address these limitations, we conduct experiments with human subjects and perform an in-depth study of the eye gaze patterns related to changes in comprehension and attention levels during reading. Both a Tobii eye tracker and an off-the-shelf webcam are used to capture the eye gaze signals from which the eye gaze features are extracted. Using machine learning algorithms, we evaluate the features and compare the classification performance of different kinds of eye gaze features.
From this investigation, we gain a better understanding of the relation between the studied mental states, i.e. comprehension and attention, and certain eye gaze patterns. We also find that features extracted from the accurate on-screen gaze locations captured by the Tobii eye tracker contribute more to comprehension and attention level detection during reading.
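As an illustration of the kind of eye gaze features evaluated above, the sketch below groups raw gaze samples into fixations using a simple dispersion-based criterion and summarizes fixation durations and saccade amplitudes. The thresholds, the sample trace, and the feature set are illustrative assumptions, not parameters taken from the thesis.

```python
# Hypothetical sketch: turning a stream of (timestamp_ms, x, y) gaze samples
# into summary eye gaze features. Dispersion/duration thresholds are assumed.
import math

def extract_gaze_features(samples, dispersion_px=30.0, min_fix_ms=100.0):
    """Group consecutive gaze samples into fixations, then summarize
    fixation durations and the saccade amplitudes between fixations."""
    fixations = []            # (duration_ms, centroid_x, centroid_y)
    window = [samples[0]]

    def close_window(win):
        dur = win[-1][0] - win[0][0]
        if dur >= min_fix_ms:
            cx = sum(p[1] for p in win) / len(win)
            cy = sum(p[2] for p in win) / len(win)
            fixations.append((dur, cx, cy))

    for s in samples[1:]:
        xs = [p[1] for p in window] + [s[1]]
        ys = [p[2] for p in window] + [s[2]]
        # Dispersion = horizontal span + vertical span of the window
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= dispersion_px:
            window.append(s)
        else:
            close_window(window)
            window = [s]
    close_window(window)

    saccades = [math.hypot(b[1] - a[1], b[2] - a[2])
                for a, b in zip(fixations, fixations[1:])]
    return {
        "fixation_count": len(fixations),
        "mean_fixation_ms": (sum(f[0] for f in fixations) / len(fixations)
                             if fixations else 0.0),
        "mean_saccade_px": (sum(saccades) / len(saccades)
                            if saccades else 0.0),
    }

# Synthetic trace: a fixation near (101, 200), then one near (400, 210)
demo_samples = [(0, 100, 200), (50, 102, 201), (100, 101, 199),
                (150, 103, 200), (200, 100, 202), (250, 101, 200),
                (300, 102, 201), (350, 400, 210), (400, 401, 211),
                (450, 399, 209), (500, 400, 210)]
feats = extract_gaze_features(demo_samples)
print(feats["fixation_count"])  # → 2
```

Features such as these would then be fed to a classifier; which thresholds and summary statistics work best is exactly the kind of question the feature evaluation addresses.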
In order to recognize human mental states, input signals reflecting those states need to be acquired and processed. Under traditional KVM (keyboard-video-mouse) settings, input signals are mostly tied to keyboard and mouse dynamics. Some information about human mental states and affect can be deduced from keyboard [12][111] and mouse [110][123] behavior, but the accuracy is not particularly high. Thanks to the popularity of interactive social networking applications, the webcam has become a de facto standard device. Recent research in video processing and machine learning has demonstrated that human affect can be recognized from webcam video, notably via facial features [127]. Inspired by this research, we investigate other modalities, namely facial expressions and mouse dynamics, for attention detection during reading. A two-level facial feature extraction approach is proposed to represent the static and dynamic states of the subjects' facial expressions. Similarly, mouse dynamics features are extracted from logged mouse events and evaluated for reading attention detection. To evaluate our method, we apply machine learning techniques to build user-independent models that recognize attention and comprehension levels in reading tasks, and compare the performance of models built on a single modality with those built on multiple modalities. The findings suggest that the multimodal approach outperforms the unimodal approach in our studies. The results also demonstrate that eye gaze patterns and facial expressions show more potential than mouse dynamics for predicting attention level, which may be due to the infrequent use of the mouse as an input device during reading.
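A minimal sketch of the feature-level fusion underlying a multimodal comparison like the one above: per-modality feature vectors (gaze, facial expression, mouse dynamics) are z-score normalized with training-set statistics and concatenated into a single vector for the classifier. All feature names and numbers here are hypothetical; the thesis's actual fusion scheme may differ.

```python
# Hypothetical feature-level fusion sketch: normalize each modality's
# features with statistics that would come from the training set, then
# concatenate them in a fixed modality order.

def zscore(vec, means, stds):
    """Standardize one feature vector; a zero std leaves the feature at 0."""
    return [(v - m) / s if s else 0.0 for v, m, s in zip(vec, means, stds)]

def fuse(modalities):
    """Concatenate normalized per-modality feature vectors.

    `modalities` maps a modality name to (vector, train_means, train_stds).
    Iterating in sorted order keeps the fused dimensions stable."""
    fused = []
    for name in sorted(modalities):
        vec, means, stds = modalities[name]
        fused.extend(zscore(vec, means, stds))
    return fused

# One synthetic reading-session sample (illustrative feature names):
#   gaze  = [fixation count, mean fixation duration (ms)]
#   face  = [mean smile intensity]
#   mouse = [move events per minute, mean speed (px/ms)]
sample = {
    "gaze":  ([12.0, 240.0], [10.0, 200.0], [2.0, 40.0]),
    "face":  ([0.3],         [0.5],         [0.1]),
    "mouse": ([4.0, 0.8],    [5.0, 1.0],    [1.0, 0.2]),
}
print(fuse(sample))  # 5-dimensional fused feature vector
```

With this setup, the unimodal baselines are simply classifiers trained on one modality's slice of the vector, which makes the multimodal-versus-unimodal comparison straightforward.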
Description: xvi, 119 pages : color illustrations
PolyU Library Call No.: [THS] LG51 .H577P COMP 2017 LiJ
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
991021965756703411_link.htm — For PolyU Users — 167 B — HTML
991021965756703411_pira.pdf — For All Users (Non-printable) — 2.83 MB — Adobe PDF

