Title: Learning a lightweight convolutional neural network for visual tracking and facial attribute analysis
Authors: Zhu, Linnan
Degree: M.Phil.
Issue Date: 2017
Abstract: In this thesis, we study object tracking and facial attribute analysis, in particular age and gender recognition. For object tracking, CNN-based trackers have recently been proposed to improve tracking performance. Despite achieving state-of-the-art performance, existing CNN trackers still have notable drawbacks. 1) Most of these methods use two separate CNNs, one for each input; this strategy greatly increases the number of model parameters, which in turn requires more labeled samples at the training stage. 2) Some CNN trackers run at over 100 fps on a GPU but very slowly on a CPU because of the high complexity of the network structure. To address these issues, we propose a novel frame-pair-based CNN architecture that balances tracking speed and accuracy. Instead of adopting two-stream CNNs, we fuse frame pairs at the input stage, resulting in a single-stream CNN tracker with far fewer parameters. The proposed tracker can learn generic motion patterns of objects from less video data than previous CNN-based methods. Evaluation is conducted on the VOT14, OTB50 and OTB100 benchmark datasets. The proposed tracker achieves results competitive with the state of the art while requiring much less memory and complexity. It tracks objects at over 100 fps on a GPU and over 30 fps on a CPU, much faster than most existing CNN-based trackers.
For age and gender recognition, CNN-based methods have achieved state-of-the-art accuracy, but they are time-consuming on mobile devices and low-end PCs because of two issues. 1) Complex CNN architectures. Most CNN-based methods directly employ popular architectures (e.g., AlexNet and VGG), which are very complex and overdesigned for age and gender recognition. 2) Treating age and gender recognition as two independent problems. In fact, age and gender recognition are two highly correlated facial attribute tasks, and optimizing them together is beneficial. In this thesis, we propose a lightweight deep model that recognizes age and gender from a face image via a joint regression model. Specifically, our model employs a multi-task learning scheme to learn shared features for these two correlated tasks in an end-to-end manner. Extensive experimental results on the recent Adience benchmark demonstrate that our model achieves recognition accuracy competitive with state-of-the-art methods while running much faster, i.e., about 10 times faster in the testing phase.
Subjects: Hong Kong Polytechnic University -- Dissertations
Human face recognition (Computer science)
Image processing -- Digital techniques
Image analysis -- Data processing
Pages: xiv, 79 pages : color illustrations
Appears in Collections: Thesis

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.