Title: On representation based pattern classification models
Authors: Zhu, Pengfei
Degree: Ph.D.
Issue Date: 2015
Abstract: In computer vision and pattern recognition, there are a variety of image based classification tasks, e.g., face recognition, action recognition, object recognition, texture classification, and handwritten digit recognition. Choosing a suitable classifier for a given classification task is not a trivial problem; it depends on the data type, data distribution, data size, and feature properties. According to the "no free lunch" theorem in machine learning, no single classifier can always achieve state-of-the-art performance in all classification tasks. Intuitively, a robust, efficient, and scalable classifier with good understandability and generalization ability is always desired. Representation based classification has been widely used in pattern classification and achieves superior performance. It is based on the assumption that a query sample can be more accurately approximated by a linear combination of training samples of its own class than by those of other classes. Many representation based classification models have been developed, including sparse/collaborative representation, low-rank representation, robust representation, kernel representation, generic representation, and multi-modal/cross-modal representation. Representation residuals in these models are discriminative, and a query sample can be assigned to the class with the minimal reconstruction residual. Meanwhile, representation coefficients can also be used as features to enhance classification. In addition, in middle-level feature extraction, sparse coding can be introduced in place of vector quantization to obtain a soft representation for classification. Although representation based classification models have achieved great success in different classification tasks, many problems remain.
When there are only a small number of training samples, the representation tends to be over-determined and therefore the query sample may not be well represented. When the number of training samples is very large, the time complexity and memory consumption of representation based classifiers become challenging issues. Besides, the existing representation based classifiers are mostly designed for single image based classification tasks. However, for video based face recognition and multi-view object recognition, the task becomes an image set classification problem, so representation based classifiers need to be extended from image based to image set based models. Finally, most existing representation based classifiers are non-discriminative in the representation process. It is interesting to investigate whether the samples can be projected to a discriminative feature space to enhance the classification performance. In this thesis, we aim to develop new representation based classification models for small sample size problems, big sample size problems, image set classification problems, and discriminative representation problems, respectively. In Chapter 2, to solve the small sample size problem in face recognition, a patch based collaborative representation classifier (PCRC) is proposed. Both the query and gallery face images are divided into patches, and each query patch is then represented over the corresponding gallery patch dictionary. Classification outputs of all the patches are combined by majority voting to get the final output. As PCRC is sensitive to patch size, a multi-scale PCRC is proposed to fuse the classification outputs of different patch sizes by margin distribution optimization.
In Chapter 3, a local generic representation (LGR) based approach is proposed for face recognition with a single sample per person. A generic intra-class variation dictionary is constructed from a generic dataset, and it can well compensate for the face variations lacking in the gallery set. A correntropy based metric is adopted to measure the loss of each patch so that the importance of different patches in face recognition can be more robustly evaluated. In Chapter 4, a self-representation induced classifier (SRIC) is proposed for representation with big sample size. Different from the existing sample-level representation, we propose representation based classifiers from the perspective of feature-level representation. The time complexity of SRIC depends only on the feature dimension and the number of classes. Hence, it is very suitable for classification tasks with a large number of training samples and a small number of classes. In Chapter 5, an image set based collaborative representation model is proposed for image set based face recognition. Considering the distinctiveness of samples in the query image set and the correlation between the gallery image sets, we model both the query and gallery image sets as hulls. Then the hull of the query image set is collaboratively represented on the gallery image sets. Regularized hull and kernel convex hull are both considered to develop robust image set based collaborative representation classifiers. In Chapter 6, by considering representation based classifiers as point-to-set distance based classifiers, we extend distance metric learning from point-to-point distance to point-to-set and set-to-set distance. The metric learning problem is modeled as a sample pair classification task and can be efficiently solved by standard support vector machine solvers.
To sum up, in this thesis we developed patch based collaborative representation, local generic representation, regularized self-representation, image set based collaborative representation, and point-to-set/set-to-set distance metric learning methods to address the representation problems with small sample size, big sample size, and image sets for pattern recognition. Our extensive experimental results demonstrate the state-of-the-art performance of the proposed methods. In future work, we will investigate generic dictionary learning for face recognition in the wild, as well as cross-modal/multi-modal dictionary learning and metric learning methods under the representation based pattern classification framework.
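Illustrative note (not part of the thesis record): the core rule described in the abstract, coding a query over the training samples and assigning it to the class with the minimal reconstruction residual, can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's implementation: it uses a ridge (l2) regularized collaborative code, a plain residual without any coefficient-norm normalization, and the function name `crc_classify` and the default `lam` are hypothetical.

```python
import numpy as np

def crc_classify(X, labels, y, lam=1e-2):
    """Minimal-residual classification with a ridge-regularized code.

    X      : (d, n) array, each column is one training sample (assumed layout)
    labels : length-n array of class labels, one per column of X
    y      : (d,) query sample
    lam    : ridge weight (hypothetical default, tune per task)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[1]

    # Code the query over ALL training samples at once:
    # a = argmin ||y - X a||^2 + lam * ||a||^2
    a = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

    # Reconstruct the query from each class's samples and coefficients only,
    # then measure how far the reconstruction is from the query.
    classes = np.unique(labels)
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ a[labels == c])
        for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

if __name__ == "__main__":
    # Toy data: two classes of 3-dimensional samples (made up for illustration).
    X = np.array([[1.0, 0.9, 0.0, 0.0],
                  [0.0, 0.1, 1.0, 0.9],
                  [0.0, 0.0, 0.0, 0.1]])
    labels = np.array([0, 0, 1, 1])
    pred, res = crc_classify(X, labels, np.array([1.0, 0.05, 0.0]))
    print("predicted class:", pred, "residuals:", res)
```

A patch based variant in the spirit of Chapter 2 would apply the same rule to each patch separately and combine the per-patch labels by majority voting; that extension is not shown here.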
Subjects: Pattern recognition systems
Hong Kong Polytechnic University -- Dissertations
Pages: xviii, 192 leaves : illustrations ; 30 cm
Appears in Collections: Thesis