Publications (10 of 18)
Upadhyay, R., Phlypo, R., Saini, R. & Liwicki, M. (2025). Giving Each Task what it Needs – Leveraging Structured Sparsity for Tailored Multi-Task Learning. In: Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi (Ed.), Computer Vision – ECCV 2024 Workshops: Milan, Italy, September 29–October 4, 2024, Proceedings. Paper presented at 18th European Conference on Computer Vision (ECCV 2024), Milan, Italy, September 29 - October 4, 2024 (pp. 202-218). Springer Science and Business Media Deutschland GmbH, XI
Giving Each Task what it Needs – Leveraging Structured Sparsity for Tailored Multi-Task Learning
2025 (English). In: Computer Vision – ECCV 2024 Workshops: Milan, Italy, September 29–October 4, 2024, Proceedings / [ed] Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi, Springer Science and Business Media Deutschland GmbH, 2025, Vol. XI, p. 202-218. Conference paper, Published paper (Refereed)
Abstract [en]

In the Multi-task Learning (MTL) framework, every task demands distinct feature representations, ranging from low-level to high-level attributes. It is vital to address the specific (feature/parameter) needs of each task, especially in computationally constrained environments. This work therefore introduces Layer-Optimized Multi-Task (LOMT) models, which use structured sparsity to refine feature selection for individual tasks and enhance the performance of all tasks in a multi-task scenario. Structured (group) sparsity systematically eliminates parameters from trivial channels and sometimes even from entire layers of a convolutional neural network during training. Consequently, the remaining layers provide the best-suited features for a given task. In this two-step approach, we then leverage the sparsity-induced layer information to build the LOMT models by connecting task-specific decoders to these strategically identified layers, deviating from conventional approaches that uniformly connect all decoders at the end of the network. This tailored architecture optimizes the network, focusing on essential features while reducing redundancy. We validate the efficacy of the proposed approach on two datasets, NYU-v2 and CelebAMask-HQ, for multiple heterogeneous tasks. A detailed performance analysis of the LOMT models against conventional MTL models shows that the LOMT models outperform them for most task combinations. The excellent qualitative and quantitative outcomes highlight the effectiveness of employing structured sparsity for optimal layer (or feature) selection.
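
To make the second step concrete, the following is a minimal PyTorch-style sketch of how task-specific decoders could be attached at the encoder stages that a sparsity analysis identifies as most useful for each task. The stage and decoder names and the dictionary of attachment points are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class LOMTSketch(nn.Module):
    """Shared encoder with per-task decoders attached at different depths."""

    def __init__(self, encoder_stages: nn.ModuleList, decoders: nn.ModuleDict,
                 attach_points: dict):
        # attach_points maps a task name to the index of the encoder stage whose
        # output feeds that task's decoder (chosen from the group-sparsity analysis).
        super().__init__()
        self.stages = encoder_stages
        self.decoders = decoders
        self.attach_points = attach_points

    def forward(self, x: torch.Tensor) -> dict:
        features = []
        for stage in self.stages:          # run the shared encoder once
            x = stage(x)
            features.append(x)
        # each task reads from its own, possibly intermediate, stage
        return {task: self.decoders[task](features[idx])
                for task, idx in self.attach_points.items()}
```

For instance, a fine-grained task such as segmentation might keep its decoder at the last stage while a coarser task reads from an earlier one, instead of every decoder being connected to the final layer.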

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2025
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 15633
Keywords
Multi-task learning, group sparsity, feature selection, layer optimization
National Category
Computer graphics and computer vision
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-113965 (URN); 10.1007/978-3-031-91979-4_16 (DOI); 001544984800015 (ISI); 2-s2.0-105008008892 (Scopus ID)
Conference
18th European Conference on Computer Vision (ECCV 2024), Milan, Italy, September 29 - October 4, 2024
Note

ISBN for host publication: 978-3-031-91978-7, 978-3-031-91979-4

Available from: 2025-07-01 Created: 2025-07-01 Last updated: 2025-11-28. Bibliographically approved
Upadhyay, R. (2025). Sharing to Learn and Learning to Share: Meta-learning to Enhance Multi-task Learning. (Doctoral dissertation). Luleå: Luleå University of Technology
Sharing to Learn and Learning to Share: Meta-learning to Enhance Multi-task Learning
2025 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Multi-Task Learning (MTL) enables the simultaneous learning of multiple tasks in a shared framework, following the principle of ‘sharing to learn’ to improve the performance of all tasks. Despite its advantages, MTL also comes with several challenges. One critical issue is negative transfer, where training on one task may degrade the performance of other tasks due to conflicts in feature representations or the sharing of incompatible knowledge between tasks. MTL requires careful optimization of knowledge sharing between tasks, as over-sharing may lead to task interference and under-sharing may prevent important knowledge transfer. MTL systems also struggle with scalability when adapting to new tasks without retraining.

The main contributions of this thesis focus on how meta-learning can enhance MTL by addressing the above-mentioned key challenges. Meta-learning leverages the knowledge gained from learning multiple tasks to dynamically adapt the learning process for new tasks, thereby focusing on ‘learning (what) to share.’ This work introduces solutions that create a unified framework by combining the adaptability of meta-learning with the task-sharing capabilities of MTL to promote effective MTL. The contributions of this thesis are organized along three dimensions. The first dimension focuses on combining MTL and meta-learning, resulting in the Multi-Task Meta-Learning (MTML) framework. The second dimension introduces structured sparsity to MTL, leading to the development of Layer-Optimized Multi-Task (LOMT) models, Structured Parameter Sparsity for Efficient Multi-Task Learning (SPARSE-MTL), and, finally, meta-sparsity. The third dimension investigates soft parameter sharing for multi-modal, multi-task feature alignment, enabling effective collaboration between different modalities.

The first major contribution (C1) is MTML, a framework that employs meta-learning to enable adaptive and efficient knowledge sharing across tasks in MTL. It uses a bi-level meta-optimization strategy to dynamically balance task-specific and shared knowledge, thereby allowing the network to learn faster and generalize better to unseen tasks while maintaining good performance across multiple tasks. The thesis further introduces structured sparsity, in particular channel-wise group sparsity in multi-task settings, resulting in the LOMT models (C2) and SPARSE-MTL (C3). Structured sparsity reduces redundant parameters in the shared architectures of MTL, with the aim of preventing over-sharing by optimizing feature sharing across all tasks. However, managing the sparsity level across tasks is a challenge, as the optimal degree of sparsification varies between tasks and task combinations. To address this, meta-sparsity (C4), an extension of SPARSE-MTL and MTML, incorporates meta-learning to dynamically learn the optimal sparsity patterns across tasks. This ensures efficient sharing of features while minimizing task interference.
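
To illustrate the bi-level structure that underlies MTML and meta-sparsity, here is a generic first-order sketch of how a meta-parameter, such as a sparsity weight λ, can receive a gradient through one inner update. The notation is illustrative and not the thesis's exact formulation.

```latex
% One inner SGD step on the training loss plus a sparsity penalty \Omega(\theta),
% weighted by a meta-learned coefficient \lambda:
\theta'(\lambda) = \theta - \alpha \,\nabla_\theta \bigl( L_{\mathrm{train}}(\theta) + \lambda\, \Omega(\theta) \bigr)

% Outer (meta) gradient of the validation loss with respect to \lambda,
% via the chain rule through \theta'(\lambda), using
% \partial \theta' / \partial \lambda = -\alpha\, \nabla_\theta \Omega(\theta):
\frac{d L_{\mathrm{val}}\bigl(\theta'(\lambda)\bigr)}{d\lambda}
  = -\,\alpha\, \nabla_\theta \Omega(\theta)^{\top} \, \nabla_{\theta'} L_{\mathrm{val}}(\theta')
```

In words: the inner step makes the updated weights depend on λ, and the outer step moves λ in the direction that lowers the validation loss of all tasks after that update.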

In addition to these hard parameter sharing approaches, the thesis also explores soft parameter sharing for multi-modal data through a multi-task, multi-modal feature alignment approach (C5), which focuses on object detection across the RGB and infrared (IR) modalities. It achieves effective knowledge sharing by applying channel-wise structured regularization to higher network layers, aligning semantic features between modalities while retaining modality-specific features in the lower layers. It thereby leverages the complementary strengths of RGB and IR data to enhance object detection performance across diverse conditions.
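
As a rough illustration of soft parameter sharing with channel-wise structure, the sketch below penalizes the per-channel norm of the weight difference between corresponding convolution layers of an RGB and an IR backbone, but only above a chosen depth. The identical-architecture assumption, the depth threshold, and the function name are illustrative, not the thesis's exact method.

```python
import torch.nn as nn

def alignment_penalty(rgb_net: nn.Module, ir_net: nn.Module, min_depth: int = 8):
    """Channel-wise (group) penalty on weight differences of higher conv layers."""
    convs_rgb = [m for m in rgb_net.modules() if isinstance(m, nn.Conv2d)]
    convs_ir = [m for m in ir_net.modules() if isinstance(m, nn.Conv2d)]
    penalty = 0.0
    for depth, (c_rgb, c_ir) in enumerate(zip(convs_rgb, convs_ir)):
        if depth < min_depth:                  # lower layers stay modality-specific
            continue
        diff = c_rgb.weight - c_ir.weight      # shape: (out_ch, in_ch, kH, kW)
        per_channel = diff.flatten(start_dim=1).norm(dim=1)   # one L2 norm per output channel
        penalty = penalty + per_channel.sum()  # L2,1 group norm across channels
    return penalty
```

In training, such a term would be added to the detection losses of both modalities with a small weight, so that higher-layer filters drift toward each other without being hard-tied.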

In summary, this thesis demonstrates how the integration of meta-learning and structured sparsity addresses fundamental challenges in MTL, resulting in more adaptable, efficient, and scalable systems. On a broader scale, this thesis also paves the way for parsimonious multi-task models, contributing to sustainable machine learning.

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2025
Series
Doctoral thesis / Luleå University of Technology, ISSN 1402-1544
Keywords
Multi-task learning, Meta learning, Structured sparsity, Feature alignment
National Category
Computer Vision and Learning Systems; Artificial Intelligence
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-111261 (URN); 978-91-8048-730-6 (ISBN); 978-91-8048-731-3 (ISBN)
Public defence
2025-03-10, A 117, Luleå University of Technology, Luleå, 09:00 (English)
Available from: 2025-01-13 Created: 2025-01-11 Last updated: 2025-10-21. Bibliographically approved
Saini, R., Upadhyay, R., Gupta, V., Chhipa, P. C., Rakesh, S., Mokayed, H., . . . Das Chakladar, D. (2024). An EEG Analysis Framework for Brain Disorder Classification Using Convolved Connectivity Features. In: 2024 9th International Conference on Frontiers of Signal Processing (ICFSP 2024). Paper presented at 9th International Conference on Frontiers of Signal Processing (ICFSP 2024), Paris, France, September 12-14, 2024 (pp. 158-162). Institute of Electrical and Electronics Engineers Inc.
Open this publication in new window or tab >>An EEG Analysis Framework for Brain Disorder Classification Using Convolved Connectivity Features
2024 (English). In: 2024 9th International Conference on Frontiers of Signal Processing (ICFSP 2024), Institute of Electrical and Electronics Engineers Inc., 2024, p. 158-162. Conference paper, Published paper (Refereed)
Abstract [en]

Electroencephalography (EEG) is a fundamental tool in the non-invasive evaluation of brain activity, providing insights into the intricate dynamics at play within neurodegenerative disorders. Conventional methodologies often fail to effectively capture the temporal as well as the intricate intra- and inter-channel dynamics, leading to diminished predictive accuracy. To address this problem, we present a framework that effectively captures temporal, intra-, and inter-channel dynamics for EEG analysis aimed at predicting neurodegenerative disorders, explicitly targeting Alzheimer's disease and dementia. The proposed method constructs aggregated recurrence matrices from the EEG channels, followed by kernel formation and a convolution operation, effectively encapsulating intra- and inter-channel spatiotemporal patterns and thereby achieving a more comprehensive representation of neural dynamics. The proposed approach was validated on public datasets, revealing competitive performance. Implementation details and code will be made accessible on GitHub.
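
As a rough illustration of one building block named above, the NumPy sketch below computes a per-channel recurrence matrix via time-delay embedding and aggregates the matrices across channels by averaging. The embedding dimension, threshold, and averaging step are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def recurrence_matrix(signal: np.ndarray, dim: int = 3, tau: int = 1,
                      eps: float = 0.5) -> np.ndarray:
    """Binary recurrence matrix of a 1-D signal via time-delay embedding."""
    n = len(signal) - (dim - 1) * tau
    # rows are delay vectors [x(t), x(t+tau), ..., x(t+(dim-1)*tau)]
    embedded = np.stack([signal[i * tau : i * tau + n] for i in range(dim)], axis=1)
    dists = np.linalg.norm(embedded[:, None, :] - embedded[None, :, :], axis=-1)
    return (dists < eps).astype(np.float32)

def aggregated_recurrence(eeg: np.ndarray, **kwargs) -> np.ndarray:
    """Average the per-channel recurrence matrices of an (n_channels, n_samples) epoch."""
    return np.mean([recurrence_matrix(ch, **kwargs) for ch in eeg], axis=0)
```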

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2024
Keywords
Alzheimer’s, Dementia, Electroencephalography (EEG), Brain signals, Convolution, Machine Learning
National Category
Neurosciences; Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-111522 (URN); 10.1109/ICFSP62546.2024.10785421 (DOI); 2-s2.0-85215675831 (Scopus ID)
Conference
9th International Conference on Frontiers of Signal Processing (ICFSP 2024), Paris, France, September 12-14, 2024
Funder
Promobilia foundation
Note

ISBN for host publication: 979-8-3503-5323-5

Available from: 2025-02-04 Created: 2025-02-04 Last updated: 2025-10-21. Bibliographically approved
Upadhyay, R., Phlypo, R., Saini, R. & Liwicki, M. (2024). Less is More: Towards parsimonious multi-task models using structured sparsity. In: Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang, Zhihui Zhu (Ed.), Proceedings of Machine Learning Research, PMLR. Paper presented at 1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024 (pp. 590-601). Proceedings of Machine Learning Research, 234
Less is More: Towards parsimonious multi-task models using structured sparsity
2024 (English). In: Proceedings of Machine Learning Research, PMLR / [ed] Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang, Zhihui Zhu, Proceedings of Machine Learning Research, 2024, Vol. 234, p. 590-601. Conference paper, Published paper (Refereed)
Abstract [en]

Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model’s memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also have the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to the l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to the l2 regularization). We analyze the results of group sparsity in both single-task and multi-task settings on two widely used multi-task learning datasets: NYU-v2 and CelebAMask-HQ. On both datasets, which each consist of three different computer vision tasks, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model’s performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
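
Below is a minimal PyTorch sketch of the channel-wise l1/l2 (group-lasso) penalty described above, assuming each group is one output channel of a shared Conv2d layer. The traversal over all convolution modules and the weighting constant are illustrative simplifications, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

def group_sparsity_penalty(shared_encoder: nn.Module) -> torch.Tensor:
    penalty = 0.0
    for module in shared_encoder.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight                                  # (out_ch, in_ch, kH, kW)
            per_channel = w.flatten(start_dim=1).norm(dim=1)   # l2 norm within each channel group
            penalty = penalty + per_channel.sum()              # l1 sum across channel groups
    return penalty

# Typical use during training (lambda_sparsity is a tuning choice, not a value from the paper):
# loss = sum(task_losses.values()) + lambda_sparsity * group_sparsity_penalty(model.encoder)
```

The l2 norm within each group shrinks whole channels together, while the l1 sum across groups pushes the least useful channels all the way to zero so they can be pruned.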

Place, publisher, year, edition, pages
Proceedings of Machine Learning Research, 2024
Keywords
Multi-task learning, structured sparsity, group sparsity, parameter pruning, semantic segmentation, depth estimation, surface normal estimation
National Category
Probability Theory and Statistics
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103838 (URN); 2-s2.0-85183883391 (Scopus ID)
Conference
1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024
Funder
Knut and Alice Wallenberg Foundation
Note

Copyright © The authors and PMLR 2024. Authors retain copyright.

Available from: 2024-01-29 Created: 2024-01-29 Last updated: 2025-10-21. Bibliographically approved
Upadhyay, R., Phlypo, R., Saini, R. & Liwicki, M. (2024). Sharing to Learn and Learning to Share; Fitting Together Meta, Multi-Task, and Transfer Learning: A Meta Review. IEEE Access, 12, 148553-148576
Sharing to Learn and Learning to Share; Fitting Together Meta, Multi-Task, and Transfer Learning: A Meta Review
2024 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 12, p. 148553-148576. Article, review/survey (Refereed), Published
Abstract [en]

Integrating knowledge across different domains is an essential feature of human learning. Learning paradigms such as transfer learning, meta-learning, and multi-task learning reflect the human learning process by exploiting prior knowledge for new tasks, encouraging faster learning and better generalization. This article gives a detailed view of these learning paradigms together with a comparative analysis. The weakness of one learning algorithm often turns out to be a strength of another, and thus merging them is a prevalent trend in the literature. Numerous research papers focus on each of these learning paradigms separately and provide a comprehensive overview of them; this article, in contrast, reviews studies that combine (two of) these learning algorithms. The survey describes how these techniques are combined to solve problems in many different fields of research, including computer vision, natural language processing, hyperspectral imaging, and more, in a supervised setting only. Based on the knowledge accumulated from the literature, we hypothesize a generic task-agnostic and model-agnostic learning network – an ensemble of meta-learning, transfer learning, and multi-task learning, termed Multi-modal Multi-task Meta Transfer Learning. We also present open research questions, limitations, and future research directions for this proposed network. The aim of this article is to spark interest among scholars in effectively merging existing learning algorithms with the intention of advancing research in this field. Instead of presenting experimental results, we invite readers to explore and contemplate techniques for merging algorithms while navigating through their limitations.

Place, publisher, year, edition, pages
IEEE, 2024
Keywords
Knowledge sharing, multi-task learning, meta-learning, multi-modal inputs, transfer learning
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-94817 (URN); 10.48550/arXiv.2111.12146 (DOI)
Note

Validated;2024;Level 2;2024-11-11 (joosat);

Full text license: CC BY

Available from: 2022-12-12 Created: 2022-12-12 Last updated: 2025-10-21. Bibliographically approved
Mishra, A. R., Kumar, R., Gupta, V., Prabhu, S., Upadhyay, R., Chhipa, P. C., . . . Saini, R. (2024). SignEEG v1.0: Multimodal Dataset with Electroencephalography and Hand-written Signature for Biometric Systems. Scientific Data, 11, Article ID 718.
SignEEG v1.0: Multimodal Dataset with Electroencephalography and Hand-written Signature for Biometric Systems
2024 (English). In: Scientific Data, E-ISSN 2052-4463, Vol. 11, article id 718. Article in journal (Refereed), Published
Abstract [en]

Handwritten signatures in biometric authentication leverage unique individual characteristics for identification, offering high specificity through dynamic and static properties. However, this modality faces significant challenges from sophisticated forgery attempts, underscoring the need for enhanced security measures in common applications. To address forgery in signature-based biometric systems, we integrate a forgery-resistant modality, non-invasive electroencephalography (EEG), which captures unique brain activity patterns. By combining EEG, a physiological modality, with handwritten signatures, a behavioral modality, our approach capitalizes on the strengths of both, significantly fortifying the robustness of biometric systems through multimodal integration. In addition, EEG’s resistance to replication offers a high level of security, making it a robust addition to user identification and verification. This study presents a new multimodal dataset, SignEEG v1.0, based on EEG and hand-drawn signatures from 70 subjects. The EEG signals and hand-drawn signatures were collected with Emotiv Insight and Wacom One sensors, respectively. The multimodal data comprise three paradigms based on mental imagery, motor imagery, and physical execution: (i) thinking of the signature’s image, (ii) drawing the signature mentally, and (iii) drawing the signature physically. Extensive experiments have been conducted to establish a baseline with machine learning classifiers. The results demonstrate that multimodality in biometric systems significantly enhances robustness, achieving high reliability even with limited sample sizes. We release the raw and pre-processed data together with easy-to-follow implementation details.
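
A baseline of the kind described above can be imagined along the lines of the following scikit-learn sketch: per-trial EEG features and signature features are concatenated (early fusion) and fed to a standard classifier. The feature shapes, the SVM choice, and the split are assumptions for illustration, not the dataset's published baseline.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def multimodal_baseline(eeg_feats: np.ndarray, sig_feats: np.ndarray,
                        labels: np.ndarray) -> float:
    """eeg_feats: (n_trials, d_eeg), sig_feats: (n_trials, d_sig), labels: subject IDs."""
    X = np.concatenate([eeg_feats, sig_feats], axis=1)   # simple early fusion
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.3, stratify=labels, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_tr, y_tr)
    return clf.score(X_te, y_te)                          # held-out accuracy
```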

Place, publisher, year, edition, pages
Nature Research, 2024
National Category
Computer Sciences; Signal Processing
Research subject
Operation and Maintenance Engineering; Machine Learning
Identifiers
urn:nbn:se:ltu:diva-108479 (URN); 10.1038/s41597-024-03546-z (DOI); 001261561300002 (ISI); 38956046 (PubMedID); 2-s2.0-85197457964 (Scopus ID)
Note

Validated;2024;Level 2;2024-08-07 (hanlid);

Full text license: CC BY

Available from: 2024-08-07 Created: 2024-08-07 Last updated: 2025-10-21. Bibliographically approved
Rakesh, S., Liwicki, F., Mokayed, H., Upadhyay, R., Chhipa, P. C., Gupta, V., . . . Saini, R. (2023). Emotions Classification Using EEG in Health Care. In: Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi (Ed.), Computer Vision and Machine Intelligence: Proceedings of CVMI 2022. Paper presented at International Conference on Computer Vision & Machine Intelligence (CVMI), Allahabad, Prayagraj, India, August 12-13, 2022 (pp. 37-49). Springer Nature
Emotions Classification Using EEG in Health Care
2023 (English). In: Computer Vision and Machine Intelligence: Proceedings of CVMI 2022 / [ed] Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi, Springer Nature, 2023, p. 37-49. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Networks and Systems (LNNS) ; 586
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-98587 (URN); 10.1007/978-981-19-7867-8_4 (DOI); 2-s2.0-85161601282 (Scopus ID)
Conference
International Conference on Computer Vision & Machine Intelligence (CVMI), Allahabad, Prayagraj, India, August 12-13, 2022
Note

ISBN for host publication: 978-981-19-7866-1, 978-981-19-7867-8

Available from: 2023-06-19 Created: 2023-06-19 Last updated: 2025-10-21. Bibliographically approved
Chhipa, P. C., Chopra, M., Mengi, G., Gupta, V., Upadhyay, R., Chippa, M. S., . . . Liwicki, M. (2023). Functional Knowledge Transfer with Self-supervised Representation Learning. In: 2023 IEEE International Conference on Image Processing: Proceedings. Paper presented at 30th IEEE International Conference on Image Processing, ICIP 2023, October 8-11, 2023, Kuala Lumpur, Malaysia (pp. 3339-3343). IEEE
Functional Knowledge Transfer with Self-supervised Representation Learning
2023 (English). In: 2023 IEEE International Conference on Image Processing: Proceedings, IEEE, 2023, p. 3339-3343. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2023
Series
Proceedings - International Conference on Image Processing, ISSN 1522-4880
National Category
Computer graphics and computer vision; Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103659 (URN); 10.1109/ICIP49359.2023.10222142 (DOI); 001106821003077 (ISI); 2-s2.0-85180766253 (Scopus ID); 978-1-7281-9835-4 (ISBN); 978-1-7281-9836-1 (ISBN)
Conference
30th IEEE International Conference on Image Processing, ICIP 2023, October 8-11, 2023, Kuala Lumpur, Malaysia
Available from: 2024-01-15 Created: 2024-01-15 Last updated: 2025-10-21. Bibliographically approved
Chhipa, P. C., Upadhyay, R., Grund Pihlgren, G., Saini, R., Uchida, S. & Liwicki, M. (2023). Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images. In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023). Paper presented at 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2-7, 2023, Waikoloa, Hawaii, USA (pp. 2716-2726). IEEE
Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images
2023 (English). In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023), IEEE, 2023, p. 2716-2726. Conference paper, Published paper (Refereed)
Abstract [en]

This work presents a novel self-supervised pre-training method for learning efficient representations, without labels, on histopathology medical images by utilizing magnification factors. Other state-of-the-art works mainly focus on fully supervised learning approaches that rely heavily on human annotations. However, the scarcity of labeled and unlabeled data is a long-standing challenge in histopathology. Currently, representation learning without labels remains unexplored in the histopathology domain. The proposed method, Magnification Prior Contrastive Similarity (MPCS), enables self-supervised learning of representations without labels on the small-scale breast cancer dataset BreakHis by exploiting the magnification factor, inductive transfer, and a reduced human prior. The proposed method matches the state-of-the-art performance of fully supervised learning in malignancy classification when only 20% of labels are used in fine-tuning, and it outperforms previous works in fully supervised learning settings on three public breast cancer datasets, including BreakHis. Further, it provides initial support for the hypothesis that reducing the human prior leads to efficient representation learning in self-supervision, which will need further investigation. The implementation of this work is available online on GitHub.
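
To give a feel for the contrastive setup that exploits magnification, the following is a hedged PyTorch sketch of a SimCLR-style NT-Xent loss in which the two views of a sample are its projections at two different magnification factors. The encoder, projection head, and batch layout are generic assumptions; this is not the authors' exact MPCS implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent(z_low_mag: torch.Tensor, z_high_mag: torch.Tensor,
            temperature: float = 0.5) -> torch.Tensor:
    """z_*: (batch, dim) projections of the same samples at two magnification factors."""
    z = F.normalize(torch.cat([z_low_mag, z_high_mag], dim=0), dim=1)   # (2B, d)
    sim = z @ z.T / temperature                                         # cosine similarities
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(mask, float("-inf"))                          # exclude self-pairs
    batch = z_low_mag.size(0)
    # the positive for sample i is its other-magnification view at index (i + B) mod 2B
    targets = torch.cat([torch.arange(batch) + batch,
                         torch.arange(batch)]).to(sim.device)
    return F.cross_entropy(sim, targets)
```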

Place, publisher, year, edition, pages
IEEE, 2023
Series
Proceedings IEEE Workshop on Applications of Computer Vision, ISSN 2472-6737, E-ISSN 2642-9381
Keywords
self-supervised learning, contrastive learning, representation learning, breast cancer, histopathological images, transfer learning, medical images
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-94845 (URN); 10.1109/WACV56688.2023.00274 (DOI); 000971500202081 (ISI); 2-s2.0-85149049398 (Scopus ID); 978-1-6654-9346-8 (ISBN)
Conference
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2-7, 2023, Waikoloa, Hawaii, USA
Available from: 2022-12-14 Created: 2022-12-14 Last updated: 2025-10-21. Bibliographically approved
Upadhyay, R., Chhipa, P. C., Phlypo, R., Saini, R. & Liwicki, M. (2023). Multi-Task Meta Learning: learn how to adapt to unseen tasks. In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings. Paper presented at 2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023. Institute of Electrical and Electronics Engineers Inc.
Multi-Task Meta Learning: learn how to adapt to unseen tasks
2023 (English). In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023. Conference paper, Published paper (Refereed)
Abstract [en]

This work proposes Multi-task Meta Learning (MTML), integrating two learning paradigms, Multi-Task Learning (MTL) and meta-learning, to bring together the best of both worlds. In particular, it focuses on the simultaneous learning of multiple tasks, an element of MTL, and on promptly adapting to new tasks, a quality of meta-learning. It is important to highlight that we focus on heterogeneous tasks, which are of distinct kinds, in contrast to the typically considered homogeneous tasks (e.g., all tasks being classification or all being regression). The fundamental idea is to train a multi-task model such that, when an unseen task is introduced, it can learn in fewer steps while offering performance at least as good as conventional single-task learning on the new task or its inclusion within the MTL. Through various experiments, we demonstrate this paradigm on two datasets and four tasks: NYU-v2 and the taskonomy dataset, for which we perform semantic segmentation, depth estimation, surface normal estimation, and edge detection. MTML achieves state-of-the-art results for three out of four tasks on the NYU-v2 dataset and for two out of four on the taskonomy dataset. On the taskonomy dataset, it was discovered that many pseudo-labeled segmentation masks lacked classes that were expected to be present in the ground truth; our MTML approach was found to be effective in detecting these missing classes, delivering good qualitative results, although quantitatively its performance was affected by the presence of incorrect ground-truth labels. The source code for reproducibility can be found at https://github.com/ricupa/MTML-learn-how-to-adapt-to-unseen-tasks.
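
The adaptation phase described above can be pictured with the following hedged PyTorch sketch: a fresh decoder for the unseen task is attached to the meta-trained shared encoder and fine-tuned for only a few gradient steps. The frozen-encoder choice, module names, and step count are assumptions for illustration; the paper's full method also relies on a bi-level meta-optimization during training.

```python
import torch
import torch.nn as nn

def adapt_to_new_task(encoder: nn.Module, new_decoder: nn.Module, loss_fn,
                      support_loader, steps: int = 20, lr: float = 1e-3):
    """Few-step adaptation of an unseen task on top of a meta-trained encoder."""
    encoder.eval()                                   # keep shared knowledge fixed
    for p in encoder.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(new_decoder.parameters(), lr=lr)
    it = iter(support_loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:                        # cycle the small support set
            it = iter(support_loader)
            x, y = next(it)
        with torch.no_grad():
            feats = encoder(x)                       # shared multi-task features
        loss = loss_fn(new_decoder(feats), y)        # train only the new task head
        opt.zero_grad()
        loss.backward()
        opt.step()
    return new_decoder
```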

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
Series
Proceedings of the International Joint Conference on Neural Networks, ISSN 2161-4393, E-ISSN 2161-4407
Keywords
depth estimation, meta learning, Multi-task learning, semantic segmentation, surface normal estimation
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-101308 (URN); 10.1109/IJCNN54540.2023.10191400 (DOI); 001046198702103 (ISI); 2-s2.0-85169569118 (Scopus ID); 978-1-6654-8868-6 (ISBN); 978-1-6654-8867-9 (ISBN)
Conference
2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023
Available from: 2023-09-12 Created: 2023-09-12 Last updated: 2025-10-21. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-9604-7193