Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/70360
Title: Learning a lightweight convolutional neural network for visual tracking and facial attribute analysis
Authors: Zhu, Linnan
Advisors: Zhang, Lei (COMP)
Keywords: Human face recognition (Computer science)
Image processing -- Digital techniques
Image analysis -- Data processing
Issue Date: 2017
Publisher: The Hong Kong Polytechnic University
Abstract: In this thesis, we study the problems of object tracking and facial attribute analysis, in particular age and gender recognition. For object tracking, CNN-based trackers have recently been proposed to improve tracking performance. Despite achieving state-of-the-art performance, existing CNN trackers still have significant drawbacks. 1) Most of these methods use two separate CNNs, one for each input; this strategy greatly increases the number of model parameters, which in turn requires more labeled samples at the training stage. 2) Some CNN trackers can run at over 100 fps on a GPU but run very slowly on a CPU because of the high complexity of the network structure. To address these issues, we propose a novel frame-pair based CNN architecture that balances tracking speed and accuracy. Instead of adopting two-stream CNNs, we fuse frame pairs at the input stage, resulting in a single-stream CNN tracker with far fewer parameters. The proposed tracker can learn generic motion patterns of objects from less video data than previous CNN-based methods. The evaluation is conducted on the VOT14, OTB50 and OTB100 benchmark datasets. The proposed tracker achieves results competitive with state-of-the-art methods while using much less memory and computation. It tracks objects at over 100 fps on a GPU and over 30 fps on a CPU, much faster than most existing CNN-based trackers.
For age and gender recognition, CNN-based methods have achieved state-of-the-art accuracy, but they are too slow for mobile devices or low-end PCs, for two reasons. 1) Complex CNN architecture: most CNN-based methods directly employ popular architectures (e.g., AlexNet and VGG), which are very complex and overdesigned for age and gender recognition. 2) Treating age and gender recognition as two independent problems: in fact, they are two highly correlated facial attribute tasks, and it is beneficial to optimize them together. In this thesis, we propose a lightweight deep model that recognizes age and gender from a face image via a joint regression model. Specifically, our model employs a multi-task learning scheme to learn shared features for these two correlated tasks in an end-to-end manner. Extensive experimental results on the recent Adience benchmark demonstrate that our model achieves recognition accuracy competitive with state-of-the-art methods at much higher speed, i.e., about 10 times faster in the testing phase.
Description: xiv, 79 pages : color illustrations
PolyU Library Call No.: [THS] LG51 .H577M COMP 2017 Zhu
URI: http://hdl.handle.net/10397/70360
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File | Description | Size | Format
991021965755003411_link.htm | For PolyU Users | 167 B | HTML
991021965755003411_pira.pdf | For All Users (Non-printable) | 2.2 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.