Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/115135
dc.contributor: Department of Computing
dc.creator: Shi, Guangyuan
dc.identifier.uri: https://theses.lib.polyu.edu.hk/handle/200/13802
dc.language.iso: English
dc.title: Optimizing knowledge transfer in continual and multi-task learning environments
dc.type: Thesis
dcterms.abstract: Optimizing knowledge transfer is a key challenge in machine learning, especially in dynamic environments where tasks and data continually evolve. Conventional machine learning methods generally rely on the premise that the feature space and data distribution remain consistent between the training and testing phases. In reality, this condition is rarely met, as real-world data often exhibits substantial variability. This limitation reduces the usability and effectiveness of models, particularly when training data is insufficient, tasks have diverse distributions, or environments change, necessitating model retraining. In such settings, models must handle multiple tasks simultaneously while managing diverse and potentially conflicting objectives. Moreover, models must learn new tasks while retaining existing knowledge and adapt swiftly to new situations or tasks with minimal retraining effort. This thesis examines strategies for optimizing knowledge transfer in continual learning (CL) and multi-task learning (MTL) to improve their performance in practical applications.
dcterms.abstract: Continual learning (CL) enables models to learn from a series of tasks while retaining knowledge from earlier tasks, thus avoiding catastrophic forgetting, where new learning negatively impacts previously acquired knowledge. We propose a novel approach that guides models to converge to flat local minima during initial training, requiring minimal adjustments when adapting to new tasks. This strategy reduces the likelihood of forgetting and enhances model robustness in dynamic environments, making it particularly effective for applications requiring continual adaptation to new data and tasks.
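The abstract does not give implementation details for steering training toward flat minima. One common family of techniques is a sharpness-aware (SAM-style) update, which evaluates the gradient at a worst-case perturbation of the weights; the toy loss and hyperparameters below are illustrative assumptions, not the thesis's actual method:

```python
import numpy as np

def loss_grad(w):
    # Toy quadratic loss f(w) = (w - 3)^2; its gradient is 2 * (w - 3).
    return 2.0 * (w - 3.0)

def sam_step(w, lr=0.1, rho=0.05):
    """One sharpness-aware update: ascend to a nearby worst-case point,
    then descend using the gradient measured there, which biases
    convergence toward flat regions of the loss surface."""
    g = loss_grad(w)
    eps = rho * g / (abs(g) + 1e-12)   # unit-norm ascent direction, scaled by rho
    g_adv = loss_grad(w + eps)         # gradient at the perturbed point
    return w - lr * g_adv              # descend with the perturbed gradient

w = 0.0
for _ in range(100):
    w = sam_step(w)
print(round(w, 3))  # ends up close to the minimum at w = 3
```

On this one-dimensional toy problem the flatness bias is invisible; in the CL setting described above, the same two-gradient-evaluation structure is applied to a neural network's parameters so that later tasks require only small weight adjustments.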
dcterms.abstract: In multi-task learning (MTL), the challenge is to transfer knowledge across different tasks without causing negative transfer, where learning one task adversely affects performance on others. Negative transfer often arises from conflicting gradients during model updates, where the update direction is dominated by tasks with larger gradient magnitudes, hindering effective learning of other tasks. To mitigate this, we introduce a method that identifies layers with severe gradient conflicts and switches them from shared to task-specific configurations. This approach prevents gradient conflicts in shared layers, ensuring balanced learning and improving overall model performance and generalization across tasks.
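The abstract does not specify how gradient conflict is measured. A standard proxy is the cosine similarity between per-layer task gradients, with strongly negative values indicating conflict; the threshold, layer names, and two-task setup below are hypothetical illustrations of that idea:

```python
import numpy as np

def conflict_score(grad_a, grad_b):
    """Cosine similarity between two tasks' gradients for one layer;
    negative values mean the tasks pull the layer in opposing directions."""
    return float(np.dot(grad_a, grad_b) /
                 (np.linalg.norm(grad_a) * np.linalg.norm(grad_b) + 1e-12))

def split_conflicting_layers(task_grads, threshold=-0.2):
    """Return the shared layers whose task gradients conflict severely
    enough that the layer should be switched to task-specific copies."""
    return [layer for layer, (g1, g2) in task_grads.items()
            if conflict_score(g1, g2) < threshold]

# Hypothetical per-layer gradients for two tasks (flattened vectors).
grads = {
    "backbone.layer1": (np.array([1.0, 0.5]), np.array([0.9, 0.6])),   # aligned
    "backbone.layer2": (np.array([1.0, 0.0]), np.array([-1.0, 0.1])),  # conflicting
}
print(split_conflicting_layers(grads))  # prints ['backbone.layer2']
```

Only the layer whose task gradients point in opposing directions is flagged for splitting, so the remaining shared layers keep transferring knowledge across tasks.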
dcterms.abstract: Additionally, given the increasing size of pretrained base models and the rising costs associated with knowledge transfer, we introduce a parameter-efficient fine-tuning (PEFT) algorithm that optimizes the adaptability of large language models (LLMs) by selectively fine-tuning only the most critical layers. Our approach learns a binary mask for each low-rank weight matrix used in LoRA, determining whether a layer needs a LoRA adapter (a mask value of 0 indicates that no adapter is required and the layer's parameters remain unchanged). This significantly reduces memory overhead and computational costs while avoiding overfitting, making transfer learning more efficient and feasible in resource-constrained environments.
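The masking behavior described above can be sketched with plain matrices. This is a minimal illustration, not the thesis's algorithm: the dimensions, rank, and the idea of passing the learned mask as a 0/1 flag are assumptions, and the mask-learning procedure itself is omitted:

```python
import numpy as np

def lora_forward(x, W, A, B, mask):
    """Linear layer with an optional LoRA adapter. With mask = 0 the
    low-rank update B @ A is skipped entirely, so the layer behaves
    exactly like the frozen pretrained weight W."""
    out = x @ W.T
    if mask:
        out = out + x @ (B @ A).T   # low-rank update, rank = A.shape[0]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))          # batch of 2, feature dim 8
W = rng.normal(size=(4, 8))          # frozen pretrained weight
A = rng.normal(size=(2, 8)) * 0.01   # rank-2 LoRA factors
B = rng.normal(size=(4, 2)) * 0.01

masked = lora_forward(x, W, A, B, mask=0)
adapted = lora_forward(x, W, A, B, mask=1)
print(np.allclose(masked, x @ W.T))  # prints True: masked layer = frozen model
```

Because a masked-out layer stores no adapter and computes no low-rank product, every 0 in the learned mask directly saves adapter parameters, optimizer state, and compute.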
dcterms.abstract: In summary, this thesis explores a set of complementary methods aimed at improving knowledge transfer in machine learning under practical constraints. By addressing critical challenges such as continual adaptation, task interference, and computational efficiency, the proposed approaches enhance the robustness and practicality of transfer learning in real-world settings. These methods, focused on reducing forgetting in continual learning, mitigating gradient conflicts in multi-task learning, and improving parameter efficiency when fine-tuning large models, offer targeted solutions to common limitations in dynamic and resource-constrained environments. Experimental results support their effectiveness, showing improvements in model stability, generalization, and adaptability. Overall, this work offers both practical insights and methodological contributions that can inform future research and applications in scalable, efficient machine learning. The results have been published in or submitted to top AI conferences.
dcterms.accessRights: open access
dcterms.educationLevel: Ph.D.
dcterms.extent: xxii, 160 pages : color illustrations
dcterms.issued: 2025
dcterms.LCSH: Machine learning
dcterms.LCSH: Reinforcement learning
dcterms.LCSH: Computer multitasking
dcterms.LCSH: Hong Kong Polytechnic University -- Dissertations
Appears in Collections: Thesis


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.