Sharing to Learn and Learning to Share: Meta-learning to Enhance Multi-task Learning
2025 (English) Doctoral thesis, monograph (Other academic)
Abstract [en]
Multi-Task Learning (MTL) enables the simultaneous learning of multiple tasks in a shared framework, following the principle of ‘sharing to learn’ to improve the performance of all tasks. Despite its advantages, MTL comes with several challenges. One critical issue is negative transfer, where training on one task degrades the performance of other tasks due to conflicts in feature representations or the sharing of incompatible knowledge between tasks. MTL therefore requires careful optimization of knowledge sharing between tasks: over-sharing may lead to task interference, while under-sharing may prevent important knowledge transfer. MTL systems also struggle with scalability when adapting to new tasks without retraining.
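To make the ‘sharing to learn’ principle concrete, the following is a minimal, illustrative sketch of hard parameter sharing in PyTorch; the layer sizes, task names, and two-layer encoder are assumptions for illustration, not the architectures used in the thesis.

```python
import torch
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared encoder, one head per task."""

    def __init__(self, in_dim: int, hidden_dim: int, task_out_dims: dict):
        super().__init__()
        # Shared trunk: every task's gradient updates these parameters,
        # which is where both positive and negative transfer originate.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Lightweight task-specific heads keep per-task knowledge separate.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, out_dim)
             for name, out_dim in task_out_dims.items()}
        )

    def forward(self, x: torch.Tensor) -> dict:
        z = self.encoder(x)  # shared representation used by every task
        return {name: head(z) for name, head in self.heads.items()}

# Hypothetical tasks and sizes, chosen only for the example.
model = HardSharingMTL(in_dim=64, hidden_dim=128,
                       task_out_dims={"segmentation": 10, "depth": 1})
outputs = model(torch.randn(8, 64))  # one forward pass serves all tasks
```

Joint training typically sums (or weights) the per-task losses, so every task shapes the shared encoder; how much each task should shape it is exactly the sharing problem the thesis addresses.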
The main contributions of this thesis focus on how meta-learning can enhance MTL by addressing the above-mentioned key challenges. Meta-learning leverages the knowledge gained from learning multiple tasks to dynamically adapt the learning process for new tasks, thereby focusing on ‘learning (what) to share.’ This work introduces solutions that create a unified framework combining the adaptability of meta-learning with the task-sharing capabilities of MTL to promote effective MTL. The contributions of this thesis are organized along three dimensions. The first dimension focuses on combining MTL and meta-learning, resulting in the Multi-Task Meta-Learning (MTML) framework. The second dimension introduces structured sparsity to MTL, leading to the development of Layer-Optimized Multi-Task (LOMT) models, Structured Parameter Sparsity for Efficient Multi-Task Learning (SPARSE-MTL), and, finally, meta-sparsity. The third dimension investigates soft parameter sharing for multi-modal, multi-task feature alignment, enabling effective collaboration between different modalities.
The first major contribution (C1) is MTML, a framework that employs meta-learning to enable adaptive and efficient knowledge sharing across tasks in MTL. It uses a bi-level meta-optimization strategy to dynamically balance task-specific and shared knowledge, allowing the network to learn faster and generalize better to unseen tasks while maintaining good performance across multiple tasks. The thesis further introduces structured sparsity, in particular channel-wise group sparsity, to multi-task settings, resulting in LOMT models (C2) and SPARSE-MTL (C3). Structured sparsity removes redundant parameters from the shared architectures of MTL, with the aim of preventing over-sharing by optimizing feature sharing across all tasks. However, managing the sparsity level across tasks is a challenge, as the optimal degree of sparsification varies across tasks and task combinations. To address this, meta-sparsity (C4), an extension of SPARSE-MTL and MTML, incorporates meta-learning to dynamically learn optimal sparsity patterns across tasks, ensuring efficient feature sharing while minimizing task interference.
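The two ingredients of these contributions can be sketched together: a channel-wise group (L2,1) penalty that zeroes out whole channels, and a one-step-unrolled bi-level loop in which an outer (meta) step differentiates through the inner update to adapt the sparsity strength. This is a hedged toy version under assumed shapes, step sizes, and losses, and the single shared convolution with frozen heads is a simplification, not the thesis implementation.

```python
import torch
import torch.nn.functional as F

def group_penalty(w: torch.Tensor) -> torch.Tensor:
    # L2 norm per output channel, summed (an L2,1 penalty): drives
    # entire channels to zero, i.e. structured, not element-wise, sparsity.
    return w.flatten(start_dim=1).norm(dim=1).sum()

torch.manual_seed(0)
w_shared = (0.1 * torch.randn(8, 3, 3, 3)).requires_grad_(True)  # shared conv
w_heads = [0.1 * torch.randn(1, 8, 1, 1) for _ in range(2)]      # frozen task heads
log_lam = torch.zeros(1, requires_grad=True)   # meta-learned sparsity strength
meta_opt = torch.optim.Adam([log_lam], lr=1e-2)

# Toy train/validation data for two regression-style tasks.
x_tr, x_val = torch.randn(4, 3, 16, 16), torch.randn(4, 3, 16, 16)
y_tr = [torch.randn(4, 1, 16, 16) for _ in range(2)]
y_val = [torch.randn(4, 1, 16, 16) for _ in range(2)]

def task_loss(w, x, ys):
    z = F.relu(F.conv2d(x, w, padding=1))
    return sum(F.mse_loss(F.conv2d(z, wh), y) for wh, y in zip(w_heads, ys))

for _ in range(10):
    meta_opt.zero_grad()
    # Inner objective: multi-task loss plus the group-sparsity penalty.
    inner = task_loss(w_shared, x_tr, y_tr) + log_lam.exp() * group_penalty(w_shared)
    # Differentiable inner step: keep the graph so lambda gets a hypergradient.
    (g,) = torch.autograd.grad(inner, w_shared, create_graph=True)
    w_fast = w_shared - 0.1 * g
    # Outer (meta) objective: validation loss of the adapted shared weights.
    task_loss(w_fast, x_val, y_val).backward()
    meta_opt.step()
    with torch.no_grad():          # commit the inner update
        w_shared -= 0.1 * g
```

The key point is that the validation loss depends on the sparsity level only through the inner update, so the outer step learns how sparse the shared weights should be for the task combination at hand rather than fixing the level by hand.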
In addition to these hard parameter sharing approaches, this thesis also explores soft parameter sharing for multi-modal data through a multi-task, multi-modal feature alignment approach (C5), focusing on object detection across RGB and infrared (IR) modalities. The approach achieves effective knowledge sharing by applying channel-wise structured regularization to the higher network layers, aligning semantic features between the modalities while retaining modality-specific features in the lower layers. It thereby leverages the complementary strengths of RGB and IR data to enhance object detection performance across diverse conditions.
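A minimal sketch of this soft-sharing idea follows, assuming a hypothetical two-convolution backbone per modality: the lower layers stay free (modality-specific), while an L2,1 penalty on the difference of the higher-layer weights pulls the two streams toward shared semantic channels. The layer sizes, penalty weight, and stand-in losses are assumptions, not the thesis detection setup.

```python
import torch
import torch.nn as nn

def aligned_group_penalty(conv_a: nn.Conv2d, conv_b: nn.Conv2d) -> torch.Tensor:
    # Channel-wise group norm of the weight *difference*: channels whose
    # group norm is driven to zero become identical (softly shared)
    # across the two modalities.
    diff = conv_a.weight - conv_b.weight
    return diff.flatten(start_dim=1).norm(dim=1).sum()

def make_backbone(in_channels: int) -> nn.ModuleDict:
    return nn.ModuleDict({
        "low": nn.Conv2d(in_channels, 16, 3, padding=1),   # modality-specific
        "high": nn.Conv2d(16, 32, 3, padding=1),           # aligned across streams
    })

rgb, ir = make_backbone(3), make_backbone(1)   # 3-ch RGB, 1-ch IR (assumed)
opt = torch.optim.SGD(list(rgb.parameters()) + list(ir.parameters()), lr=1e-2)

x_rgb, x_ir = torch.randn(2, 3, 32, 32), torch.randn(2, 1, 32, 32)
for _ in range(5):
    opt.zero_grad()
    f_rgb = rgb["high"](torch.relu(rgb["low"](x_rgb)))
    f_ir = ir["high"](torch.relu(ir["low"](x_ir)))
    # Stand-in per-modality task losses; a real system would use
    # detection losses here.
    task = f_rgb.pow(2).mean() + f_ir.pow(2).mean()
    align = 1e-3 * aligned_group_penalty(rgb["high"], ir["high"])
    (task + align).backward()
    opt.step()
```

Because the penalty acts only on the higher layer, lower layers remain free to model modality-specific statistics (e.g. thermal versus visible-light textures), which is the intended division of labor in soft parameter sharing.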
In summary, this thesis demonstrates how the integration of meta-learning and structured sparsity addresses fundamental challenges in MTL, resulting in more adaptable, efficient, and scalable systems. On a broader scale, this thesis also paves the way for parsimonious multi-task models, contributing to sustainable machine learning.
Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2025.
Series
Doctoral thesis / Luleå University of Technology, ISSN 1402-1544
Keywords [en]
Multi-task learning, Meta-learning, Structured sparsity, Feature alignment
National Category
Computer Systems
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-111261
ISBN: 978-91-8048-730-6 (print)
ISBN: 978-91-8048-731-3 (electronic)
OAI: oai:DiVA.org:ltu-111261
DiVA, id: diva2:1926425
Public defence
2025-03-10, A 117, Luleå University of Technology, Luleå, 09:00 (English)
Available from: 2025-01-13 Created: 2025-01-11 Last updated: 2025-02-03 Bibliographically approved