Facial affect recognition : from feature engineering to deep learning

Chen, Junkai

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/83140

Title:	Facial affect recognition : from feature engineering to deep learning
Authors:	Chen, Junkai
Degree:	Ph.D.
Issue Date:	2017
Abstract:	Facial expression recognition has been a long-standing problem and has attracted growing interest from the affective computing community. This thesis presents the research I conducted on facial affect recognition with novel hand-crafted features and deep learning. Three main contributions are reported: (1) an effective approach with novel features for facial expression recognition in video; (2) a framework with multiple tasks for detecting and locating pain events in video; and (3) an effective method with a deep convolutional neural network for smile detection in the wild.
In the first investigation, I propose novel features and an application of multi-kernel learning to combine multiple features for facial expression recognition in video. A new feature descriptor called Histogram of Oriented Gradients from Three Orthogonal Planes (HOG-TOP) is proposed to characterize facial appearance changes. A new effective geometric feature is also proposed to capture facial configuration changes. The role of the audio modality in affect recognition is also explored. Multiple feature fusion is used to combine the different features optimally. Experimental results show that, compared with other state-of-the-art methods, our approach is robust in dealing with video-based facial expression recognition problems both in lab-controlled environments and in the wild.
In the second investigation, I propose an effective framework with multiple tasks for pain event detection and locating. Histogram of Oriented Gradients (HOG) of fiducial points (P-HOG) and HOG-TOP are used to characterize spatial features and dynamic textures from video frames and video segments. Both frame-level and segment-level detections are based on trained Support Vector Machines (SVMs). A max pooling strategy is further used to obtain the global P-HOG and global HOG-TOP, and an SVM with multiple kernels is trained for pain event detection. Finally, an effective probabilistic fusion method is proposed to integrate the three different tasks (frame, segment and sequence) to locate pain events in video. Experimental results show that the proposed method outperforms other state-of-the-art methods in both pain event detection and pain event locating in video.
In the third investigation, I propose an effective approach for smile detection in the wild with deep learning. Deep learning can effectively combine feature learning and classification into a single model. In this study, a deep convolutional network called Smile-CNN is used to perform feature learning and smile detection simultaneously. I also discuss the discriminative power of the features learned by the Smile-CNN model. By feeding the learned features to an SVM or AdaBoost classifier for training, I show that they have impressive discriminative power. Experimental results show that the proposed approach can achieve promising performance in smile detection.
Subjects:	Hong Kong Polytechnic University -- Dissertations; Human face recognition (Computer science); Pattern recognition systems; Machine learning
Pages:	xxii, 152 pages : color illustrations
Appears in Collections:	Thesis

Access

View full-text via https://theses.lib.polyu.edu.hk/handle/200/8989
