Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/61234
Title: Visual understanding via multi-feature shared learning with global consistency
Authors: Zhang, L; Zhang, D
Keywords: Multi-feature learning; Multimedia understanding; Semi-supervised learning; Visual recognition
Issue Date: 2016
Publisher: Institute of Electrical and Electronics Engineers
Source: IEEE Transactions on Multimedia, 2016, v. 18, no. 2, article no. 7360941, p. 247-259
Journal: IEEE Transactions on Multimedia
Abstract: Image/video data is usually represented with multiple visual features, and fusing such multi-source information to establish attributes is widely recognized as beneficial. Multi-feature visual recognition has therefore received much attention in multimedia applications. This paper studies visual understanding via a newly proposed ℓ2-norm-based multi-feature shared learning framework, which simultaneously learns a global label matrix and multiple sub-classifiers from the labeled multi-feature data. Additionally, a group graph manifold regularizer, composed of Laplacian and Hessian graphs, is proposed; it better preserves the manifold structure of each feature, so that label prediction is much improved through semi-supervised learning with global label consistency. For convenience, the proposed approach is called the global-label-consistent classifier (GLCC). Its merits include the following: 1) the manifold structure information of each feature is exploited during learning, yielding a more faithful classification owing to the global label consistency; 2) a group graph manifold regularizer based on Laplacian and Hessian regularization is constructed; and 3) an efficient alternating optimization method is introduced as a fast solver, since each sub-problem is convex. Experiments on several benchmark visual datasets for multimedia understanding (the 17-category Oxford Flower dataset, the challenging 101-category Caltech dataset, the YouTube and Consumer Videos dataset, and the large-scale NUS-WIDE dataset) demonstrate that the proposed approach compares favorably with state-of-the-art algorithms. An extensive experiment using deep convolutional activation features also shows the effectiveness of the proposed approach. The code will be available at http://www.escience.cn/people/lei/index.html.
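The group graph regularizer described in the abstract asks one shared label matrix to vary smoothly over a separate graph built from each feature view. As a rough illustration only, the Python sketch below builds a k-NN heat-kernel Laplacian per feature and sums them into a single group regularizer; the k-NN construction, the bandwidth, the equal view weights, and all function names are assumptions made for this sketch, not code from the paper (GLCC additionally uses Hessian graphs and jointly learns sub-classifiers).

```python
# Minimal sketch of a per-feature Laplacian graph regularizer combined
# across views, in the spirit of the Laplacian part of GLCC's group graph
# manifold regularizer. All names and hyperparameters are illustrative
# assumptions, not the authors' implementation.
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W for one feature matrix X
    (n samples x d dims), using a symmetrized k-NN heat-kernel weight W.
    In practice sigma would be tuned, e.g., to the median pairwise distance."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-loops
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]                  # k nearest neighbors
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)                            # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W

def group_laplacian(features, weights=None):
    """Weighted sum of per-feature Laplacians: one shared regularizer,
    so a label matrix F must be smooth on every feature's graph."""
    if weights is None:
        weights = [1.0 / len(features)] * len(features)
    return sum(w * knn_graph_laplacian(X) for w, X in zip(weights, features))

# Usage: two feature views of the same 100 samples; the manifold penalty
# on a candidate label matrix F is trace(F^T L F).
rng = np.random.default_rng(0)
views = [rng.standard_normal((100, 32)), rng.standard_normal((100, 64))]
L = group_laplacian(views)
F = rng.standard_normal((100, 17))  # e.g., 17 Oxford Flower classes
penalty = np.trace(F.T @ L @ F)
print(f"manifold smoothness penalty: {penalty:.2f}")
```

The trace term trace(FᵀLF) is the standard Laplacian smoothness penalty that such semi-supervised objectives minimize alongside a label-fitting loss, which is what ties the multiple feature graphs to the globally consistent label matrix.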
URI: http://hdl.handle.net/10397/61234
ISSN: 1520-9210
EISSN: 1941-0077
DOI: 10.1109/TMM.2015.2510509
Appears in Collections: Journal/Magazine Article

Scopus™ citations: 12 (as of Nov 30, 2017)
Web of Science™ citations: 10 (as of Dec 10, 2017)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.