Title: Learning representations with information on hand
Authors: Jiang, Wenhao
Degree: Ph.D.
Issue Date: 2014
Abstract: The representation (e.g., features or kernels) of samples is crucial for machine learning algorithms. Usually, the raw representation is not good enough to provide satisfactory performance, so it is necessary to learn new and/or better representations for samples collected from different sources. In this thesis, three new methods are proposed for learning effective representations from multiple kernels, from auxiliary datasets, and from the difference between domains, respectively. The first is on learning a kernel from candidate kernels. Most dimensionality reduction techniques are based on a single metric or kernel, so it is necessary to select an appropriate kernel for kernel-based dimensionality reduction. Multiple kernel learning for dimensionality reduction (MKL-DR) was recently proposed to learn a kernel from a set of base kernels that are seen as different descriptions of the data. As MKL-DR does not involve regularization, it can be ill-posed under some conditions, which hinders its applications. A multiple kernel learning framework for dimensionality reduction based on a regularized trace ratio, termed MKL-TR, is therefore developed. Our method aims at learning a transformation into a lower-dimensional space and a corresponding kernel from the given base kernels, some of which may not be suitable for the given data. Solutions for the proposed framework are found via trace ratio maximization. Experimental results on benchmark text, image, and sound datasets demonstrate its effectiveness in supervised, unsupervised, and semi-supervised settings.
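Trace ratio maximization, on which MKL-TR's solutions are based, is commonly solved by alternating an eigendecomposition with a ratio update. A minimal sketch of that generic subproblem follows, with hypothetical toy scatter matrices; it is not the thesis's full MKL-TR, which additionally learns base-kernel combination weights:

```python
import numpy as np

def trace_ratio(Sb, Sw, d, iters=50, tol=1e-10):
    """Maximize tr(W' Sb W) / tr(W' Sw W) over orthonormal W with d
    columns: iterate eigendecomposition of Sb - lam*Sw and update lam."""
    lam = 0.0
    for _ in range(iters):
        # Eigenvectors of Sb - lam*Sw for the d largest eigenvalues
        vals, vecs = np.linalg.eigh(Sb - lam * Sw)
        W = vecs[:, -d:]
        lam_new = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
        if abs(lam_new - lam) < tol:
            lam = lam_new
            break
        lam = lam_new
    return W, lam

# Hypothetical toy scatter matrices (e.g., built from a combined kernel)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)); Sb = A @ A.T          # "between" scatter
B = rng.standard_normal((5, 5)); Sw = B @ B.T + 5 * np.eye(5)  # "within"
W, ratio = trace_ratio(Sb, Sw, d=2)
```

The inner eigenproblem keeps the columns of W orthonormal at every step, and the scalar ratio is monotonically non-decreasing across iterations under this classic scheme.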
The second is on learning representations from auxiliary datasets for clustering. Transferring knowledge from auxiliary datasets has proved useful in machine learning tasks, but its adoption in clustering is still limited. Despite its superior performance, spectral clustering has not yet incorporated knowledge transfer or transfer learning. We make such an attempt and propose a new algorithm called transfer spectral clustering (TSC). It involves not only the data manifold information of the clustering task but also the feature manifold information shared between related clustering tasks. Furthermore, it makes use of co-clustering to achieve and control the knowledge transfer among tasks. As demonstrated by the experimental results, TSC can greatly improve clustering performance over other state-of-the-art clustering algorithms by effectively using auxiliary unlabeled data.

The third is on learning representations from the difference between domains for domain adaptation. Recently, deep learning methods that employ stacked denoising auto-encoders (SDAs) have been successfully applied to domain adaptation. Remarkable performance on multi-domain sentiment analysis datasets has been reported, making deep learning a promising approach to domain adaptation problems. SDAs are distinguished by learning robust data representations for recovering the original features that have been artificially corrupted with noise. This idea was further exploited by a state-of-the-art method called mSDA, which marginalizes out the random corruptions. In this thesis, a deep learning method for domain adaptation called the regularized least squares denoising auto-encoder (RLSDA) is proposed. It works on the intuition that useful information should increase layer by layer.
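The spectral clustering baseline that TSC extends can be sketched minimally for two clusters. Gaussian affinity and a Fiedler-vector sign split are illustrative choices for a sketch; this is the standard single-task method, not the TSC algorithm with its feature-manifold and co-clustering terms:

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Two-way spectral clustering sketch: Gaussian affinity, symmetric
    normalized Laplacian, sign split on the Fiedler eigenvector."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)            # ascending eigenvalues
    fiedler = vecs[:, 1]                      # 2nd-smallest eigenvector
    return (fiedler > 0).astype(int)

# Two well-separated blobs should be recovered as two clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels = spectral_bipartition(X)
```

For more than two clusters one would take the k smallest eigenvectors and run k-means on the normalized rows; TSC modifies the objective that produces these eigenvectors rather than the final clustering step.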
By linking it to principal component analysis (PCA), deep learning with RLSDA is formulated as two independent steps in each layer, namely regularized least squares and a nonlinear transformation. It generalizes mSDA and can be viewed and explained from a PCA perspective. Experimental results demonstrate that the proposed method is effective on multi-domain sentiment classification tasks, including the Amazon review dataset, the spam dataset from the ECML/PKDD 2006 discovery challenge, and the 20 Newsgroups dataset. We also show that the performance of the new method can be further improved by simply augmenting the features learned by RLSDA.
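The two per-layer steps described above can be sketched with explicit masking noise and a ridge-regression solve. All parameters here are hypothetical, and this is not the thesis's exact RLSDA (which, like mSDA, can avoid sampling by marginalizing the corruption in closed form); it only illustrates the "regularized least squares, then nonlinear transformation" layer structure:

```python
import numpy as np

def denoising_layer(X, p=0.3, lam=1e-2, copies=5, rng=None):
    """One denoising layer sketch: (1) regularized least squares that
    reconstructs clean features from randomly masked copies,
    (2) a nonlinear transformation (tanh)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # Step 1: ridge regression mapping corrupted -> clean features
    Xc = np.vstack([X * (rng.random(X.shape) > p) for _ in range(copies)])
    Xt = np.vstack([X] * copies)                      # clean targets
    d = X.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ Xt)
    # Step 2: nonlinear transformation of the denoised features
    return np.tanh(X @ W), W

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
H, W = denoising_layer(X, rng=rng)
```

Stacking then simply feeds H into the next layer; augmenting the learned features, as mentioned above, would amount to concatenating H (and deeper layers' outputs) with the raw X before the final classifier.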
Subject: Hong Kong Polytechnic University -- Dissertations
Pages: xvi, 90 leaves : ill. ; 30 cm.
Appears in Collections: Thesis
View full-text via https://theses.lib.polyu.edu.hk/handle/200/7408