Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/91809
Title: New insights into meta-learning : from theory to algorithms
Authors: Chen, Jiaxin
Degree: Ph.D.
Issue Date: 2021
Abstract: Deep learning has surpassed human-level performance in various domains where training data is abundant. Human intelligence, however, can adapt to a new environment with little experience or recognize new categories after seeing just a few training samples. To endow machines with such humanlike few-shot learning skills, meta-learning provides promising solutions and has recently generated a surge of interest. A meta-learning algorithm (meta-algorithm) trains over a large number of i.i.d. tasks sampled from a task distribution and learns an algorithm (inner-task algorithm) that can quickly adapt to a future task with little training data. From a generalization view, traditional supervised learning studies the generalization of a learned hypothesis (predictor) to novel data samples in a given task, whereas meta-learning explores the generalization of a learned algorithm to novel tasks. In this thesis, we provide new insights into the generalization of modern meta-learning based on theoretical analysis and propose new algorithms to improve its generalization ability. We provide theoretical investigations into the Support/Query (S/Q) episodic training strategy, which is widely believed to improve generalization and is applied in modern meta-learning algorithms. We analyze the generalization error bound of generic meta-learning algorithms trained with this strategy via a stability analysis. We show that the S/Q episodic training strategy naturally leads to a counterintuitive generalization bound of O(1/√n), which depends only on the number of tasks n and is independent of the inner-task sample size m. Under the common assumption m << n for few-shot learning, the bound of O(1/√n) implies a strong generalization guarantee for modern meta-learning algorithms in the few-shot regime.
We further point out that the existing modern meta-training strategy still has limitations, namely that model parameters are optimized by differentiating through the inner-task optimization path. To satisfy this requirement, the inner-task algorithms must be solved analytically, which significantly limits the capacity and performance of the learned inner-task algorithms. Hence, we propose an adaptation-agnostic meta-training strategy that removes this dependency and can be used to train inner-task algorithms with or without analytical expressions. This general meta-training strategy naturally leads to an ensemble framework that can efficiently combine various types of algorithms to achieve better generalization. Motivated by a closer look at metric-based meta-algorithms, which show high generalization ability on few-shot classification problems, we propose a generic variational metric scaling framework that is compatible with metric-based meta-algorithms and achieves consistent improvements on the standard few-shot classification benchmarks. As the majority of existing meta-algorithms focus on within-domain generalization, we further consider cross-domain generalization on a realistic yet more challenging few-shot classification problem, where a large discrepancy exists between the task distributions of the training domains and a test domain. A gradient-based hierarchical meta-learning framework (M2L) is proposed to solve this problem. Finally, we discuss open questions and future directions in meta-learning.
Subjects: Machine learning
Data mining
Algorithms
Hong Kong Polytechnic University -- Dissertations
Pages: ix, 86 pages : color illustrations
Appears in Collections: Thesis
