Title: New insights into meta-learning : from theory to algorithms
Authors: Chen, Jiaxin
Degree: Ph.D.
Issue Date: 2021
Abstract: Deep learning has surpassed human-level performance in various domains where training data is abundant. Human intelligence, however, can adapt to a new environment with little experience or recognize new categories after seeing just a few training samples. To endow machines with such humanlike few-shot learning skills, meta-learning provides promising solutions and has recently generated a surge of interest. A meta-learning algorithm (meta-algorithm) trains over a large number of i.i.d. tasks sampled from a task distribution and learns an algorithm (inner-task algorithm) that can quickly adapt to a future task with little training data. From a generalization view, traditional supervised learning studies the generalization of a learned hypothesis (predictor) to novel data samples in a given task, whereas meta-learning explores the generalization of a learned algorithm to novel tasks. In this thesis, we provide new insights into the generalization of modern meta-learning based on theoretical analysis and propose new algorithms to improve its generalization ability. We provide theoretical investigations into the Support/Query (S/Q) episodic training strategy, which is widely believed to improve generalization and is applied in modern meta-learning algorithms. We analyze the generalization error bound of generic meta-learning algorithms trained with this strategy via a stability analysis. We show that the S/Q episodic training strategy naturally leads to a counterintuitive generalization bound of O(1/√n), which depends only on the number of tasks n and is independent of the inner-task sample size m. Under the common assumption m << n for few-shot learning, the bound of O(1/√n) implies a strong generalization guarantee for modern meta-learning algorithms in the few-shot regime.
We further point out a limitation of the existing modern meta-training strategy: model parameters are optimized by differentiating through the inner-task optimization path. To satisfy this requirement, the inner-task algorithms should be solved analytically, which significantly limits the capacity and performance of the learned inner-task algorithms. Hence, we propose an adaptation-agnostic meta-training strategy that removes this dependency and can be used to train inner-task algorithms with or without analytical expressions. This general meta-training strategy naturally leads to an ensemble framework that can efficiently combine various types of algorithms to achieve better generalization. Motivated by a closer look at metric-based meta-algorithms, which show high generalization ability on few-shot classification problems, we propose a generic variational metric scaling framework that is compatible with metric-based meta-algorithms and achieves consistent improvements on the standard few-shot classification benchmarks. As the majority of existing meta-algorithms focus on within-domain generalization, we further consider cross-domain generalization on a realistic yet more challenging few-shot classification problem, where a large discrepancy exists between the task distributions of the training domains and a test domain. A gradient-based hierarchical meta-learning framework (M2L) is proposed to solve this problem. Finally, we discuss open questions and future directions in meta-learning.
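Illustration (not part of the thesis record): the Support/Query episodic training strategy named in the abstract can be pictured with a small sampling routine. The sketch below is a minimal, hypothetical example and not the thesis's own code. The function name `sample_episode`, the parameters `n_way`, `k_shot` and `q_queries`, and the toy class pool are all assumptions made for illustration. It shows only the data split: each sampled task is divided into a support set, on which an inner-task algorithm would adapt, and a disjoint query set, on which the adapted predictor would be evaluated.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[int, str]  # (item id, class label); purely illustrative


def sample_episode(
    pool: Dict[str, List[int]],
    n_way: int,
    k_shot: int,
    q_queries: int,
    rng: random.Random,
) -> Tuple[List[Example], List[Example]]:
    """Draw one task and split it into disjoint support and query sets."""
    # Pick the classes that define this task.
    classes = rng.sample(sorted(pool), n_way)
    support: List[Example] = []
    query: List[Example] = []
    for label in classes:
        # Sample without replacement so support and query never overlap.
        items = rng.sample(pool[label], k_shot + q_queries)
        support += [(i, label) for i in items[:k_shot]]
        query += [(i, label) for i in items[k_shot:]]
    return support, query


# Toy pool (hypothetical data): 5 classes with 20 item ids each.
pool = {f"class_{c}": list(range(c * 100, c * 100 + 20)) for c in range(5)}
rng = random.Random(0)
support, query = sample_episode(pool, n_way=3, k_shot=1, q_queries=4, rng=rng)
```

In a meta-training loop, one such episode would be drawn per task: the inner-task algorithm would be fit on `support`, and the loss on `query` would drive the update of the meta-parameters. Those steps are omitted here because the abstract does not specify them.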
Subjects: Machine learning
Data mining
Hong Kong Polytechnic University -- Dissertations
Pages: ix, 86 pages : color illustrations
Appears in Collections: Thesis



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.