Saini, Rajkumar, Dr. (ORCID iD: orcid.org/0000-0001-8532-0895)
Publications (10 of 34)
Upadhyay, R., Phlypo, R., Saini, R. & Liwicki, M. (2024). Less is More: Towards parsimonious multi-task models using structured sparsity. In: Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang, Zhihui Zhu (Ed.), Proceedings of Machine Learning Research, PMLR. Paper presented at 1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024 (pp. 590-601). Proceedings of Machine Learning Research, 234
Less is More: Towards parsimonious multi-task models using structured sparsity
2024 (English) In: Proceedings of Machine Learning Research, PMLR / [ed] Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang, Zhihui Zhu, Proceedings of Machine Learning Research, 2024, Vol. 234, p. 590-601. Conference paper, Published paper (Refereed)
Abstract [en]

Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model’s memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also possess the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers’ parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely used multi-task learning datasets: NYU-v2 and CelebAMask-HQ. On both datasets, each consisting of three different computer vision tasks, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model’s performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
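A minimal sketch of the channel-wise l1/l2 (group lasso) penalty the abstract describes, assuming each output channel of a shared convolutional layer forms one group; the function name and `alpha` value are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

def group_sparsity_penalty(model: nn.Module, alpha: float = 1e-4) -> torch.Tensor:
    """Channel-wise l1/l2 (group lasso) penalty over Conv2d layers.

    The l2 norm is taken within each output-channel group, and the plain
    (l1) sum over groups drives whole channels toward zero, enabling the
    structured pruning described in the abstract.
    """
    penalty = torch.zeros(())
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # weight shape: (out_channels, in_channels, kH, kW)
            groups = module.weight.flatten(1)        # one row per channel
            penalty = penalty + groups.norm(p=2, dim=1).sum()
    return alpha * penalty
```

During multi-task training, such a term would simply be added to the summed task losses, e.g. `loss = sum(task_losses) + group_sparsity_penalty(shared_backbone)`.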

Place, publisher, year, edition, pages
Proceedings of Machine Learning Research, 2024
Keywords
Multi-task learning, structured sparsity, group sparsity, parameter pruning, semantic segmentation, depth estimation, surface normal estimation
National Category
Probability Theory and Statistics
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103838 (URN)2-s2.0-85183883391 (Scopus ID)
Conference
1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024
Funder
Knut and Alice Wallenberg Foundation
Note

Copyright © The authors and PMLR 2024. Authors retain copyright.

Available from: 2024-01-29 Created: 2024-01-29 Last updated: 2024-02-19. Bibliographically approved
Simistira Liwicki, F., Gupta, V., Saini, R., De, K., Abid, N., Rakesh, S., . . . Eriksson, J. (2023). Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition. Scientific Data, 10, Article ID 378.
Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition
2023 (English) In: Scientific Data, E-ISSN 2052-4463, Vol. 10, article id 378. Article in journal (Refereed), Published
Abstract [en]

The recognition of inner speech, which could give a ‘voice’ to patients who have no ability to speak or move, is a challenge for brain-computer interfaces (BCIs). A shortcoming of the available datasets is that they do not combine modalities to increase the performance of inner speech recognition. Multimodal datasets of brain data enable the fusion of neuroimaging modalities with complementary properties, such as the high spatial resolution of functional magnetic resonance imaging (fMRI) and the temporal resolution of electroencephalography (EEG), and therefore are promising for decoding inner speech. This paper presents the first publicly available bimodal dataset containing EEG and fMRI data acquired nonsimultaneously during inner-speech production. Data were obtained from four healthy, right-handed participants during an inner-speech task with words in either a social or numerical category. Each of the eight word stimuli was assessed in 40 trials, resulting in 320 trials per modality for each participant. The aim of this work is to provide a publicly available bimodal dataset on inner speech, contributing towards speech prostheses.

Place, publisher, year, edition, pages
Springer Nature, 2023
National Category
Computer Sciences Computer Vision and Robotics (Autonomous Systems)
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-98322 (URN)10.1038/s41597-023-02286-w (DOI)001006100600001 ()37311807 (PubMedID)2-s2.0-85161923014 (Scopus ID)
Note

Validated; 2023; Level 2; 2023-06-13 (hanlid);

Funder: Grants for Excellent Research Projects Proposals of SRT.ai 2022

Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2023-10-11. Bibliographically approved
Chhipa, P. C., Rodahl Holmgren, J., De, K., Saini, R. & Liwicki, M. (2023). Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions?. In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023): . Paper presented at IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, October 2-6, 2023 (pp. 4469-4478). Institute of Electrical and Electronics Engineers Inc.
Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions?
2023 (English) In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Institute of Electrical and Electronics Engineers Inc., 2023, p. 4469-4478. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103984 (URN)10.1109/ICCVW60793.2023.00481 (DOI)2-s2.0-85182928560 (Scopus ID)
Conference
IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, October 2-6, 2023
Note

ISBN for host publication: 979-8-3503-0745-0;

Available from: 2024-01-29 Created: 2024-01-29 Last updated: 2024-01-29
Rakesh, S., Liwicki, F., Mokayed, H., Upadhyay, R., Chhipa, P. C., Gupta, V., . . . Saini, R. (2023). Emotions Classification Using EEG in Health Care. In: Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi (Ed.), Computer Vision and Machine Intelligence: Proceedings of CVMI 2022. Paper presented at International Conference on Computer Vision & Machine Intelligence (CVMI), Allahabad, Prayagraj, India, August 12-13, 2022 (pp. 37-49). Springer Nature
Emotions Classification Using EEG in Health Care
2023 (English) In: Computer Vision and Machine Intelligence: Proceedings of CVMI 2022 / [ed] Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi, Springer Nature, 2023, p. 37-49. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Networks and Systems (LNNS) ; 586
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-98587 (URN)10.1007/978-981-19-7867-8_4 (DOI)2-s2.0-85161601282 (Scopus ID)
Conference
International Conference on Computer Vision & Machine Intelligence (CVMI), Allahabad, Prayagraj, India, August 12-13, 2022
Note

ISBN for host publication: 978-981-19-7866-1, 978-981-19-7867-8

Available from: 2023-06-19 Created: 2023-06-19 Last updated: 2023-09-05. Bibliographically approved
Chhipa, P. C., Chopra, M., Mengi, G., Gupta, V., Upadhyay, R., Chippa, M. S., . . . Liwicki, M. (2023). Functional Knowledge Transfer with Self-supervised Representation Learning. In: 2023 IEEE International Conference on Image Processing: Proceedings: . Paper presented at 30th IEEE International Conference on Image Processing, ICIP 2023, October 8-11, 2023, Kuala Lumpur, Malaysia (pp. 3339-3343). IEEE
Functional Knowledge Transfer with Self-supervised Representation Learning
2023 (English) In: 2023 IEEE International Conference on Image Processing: Proceedings, IEEE, 2023, p. 3339-3343. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2023
Series
Proceedings - International Conference on Image Processing, ISSN 1522-4880
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103659 (URN)10.1109/ICIP49359.2023.10222142 (DOI)2-s2.0-85180766253 (Scopus ID)978-1-7281-9835-4 (ISBN)978-1-7281-9836-1 (ISBN)
Conference
30th IEEE International Conference on Image Processing, ICIP 2023, October 8-11, 2023, Kuala Lumpur, Malaysia
Available from: 2024-01-15 Created: 2024-01-15 Last updated: 2024-01-15. Bibliographically approved
Xie, Y., Mouchère, H., Simistira Liwicki, F., Rakesh, S., Saini, R., Nakagawa, M., . . . Truong, T.-N. (2023). ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions. In: Gernot A. Fink, Rajiv Jain, Koichi Kise & Richard Zanibbi (Ed.), Document Analysis and Recognition - ICDAR 2023, Part II: . Paper presented at 17th International Conference on Document Analysis and Recognition (ICDAR 2023), San José, CA, United States, August 21-26, 2023 (pp. 553-565). Springer
ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions
2023 (English) In: Document Analysis and Recognition - ICDAR 2023, Part II / [ed] Gernot A. Fink, Rajiv Jain, Koichi Kise & Richard Zanibbi, Springer, 2023, p. 553-565. Conference paper, Published paper (Refereed)
Abstract [en]

This paper overviews the 7th edition of the Competition on Recognition of Handwritten Mathematical Expressions. ICDAR 2023 CROHME proposes three tasks with three different modalities: on-line, off-line, and bimodal. 3905 newly collected handwritten equations provide new training, validation, and test sets for the two modalities. The complete training set extends the previous CROHME training set with complementary off-line samples (from the OffRaSHME competition) and generated on-line samples. The evaluation is conducted using the same protocol as the previous CROHME, allowing a fair comparison with previous results. For the first time, this competition allows comparing on-line and off-line systems on the same test set. Six participating teams were evaluated. The same team won all three tasks, with an expression recognition rate above 80%.

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14188
Keywords
bimodal, dataset, evaluation, handwriting recognition, mathematical expression recognition
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103376 (URN)10.1007/978-3-031-41679-8_33 (DOI)2-s2.0-85173579550 (Scopus ID)
Conference
17th International Conference on Document Analysis and Recognition (ICDAR 2023), San José, CA, United States, August 21-26, 2023
Note

ISBN for host publication: 978-3-031-41678-1 (print), 978-3-031-41679-8 (electronic)

Available from: 2024-01-03 Created: 2024-01-03 Last updated: 2024-01-03. Bibliographically approved
Chhipa, P. C., Upadhyay, R., Grund Pihlgren, G., Saini, R., Uchida, S. & Liwicki, M. (2023). Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images. In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023): . Paper presented at 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2-7, 2023, Waikoloa, Hawaii, USA (pp. 2716-2726). IEEE
Magnification Prior: A Self-Supervised Method for Learning Representations on Breast Cancer Histopathological Images
2023 (English) In: Proceedings: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023), IEEE, 2023, p. 2716-2726. Conference paper, Published paper (Refereed)
Abstract [en]

This work presents a novel self-supervised pre-training method to learn efficient representations without labels on histopathology medical images, utilizing magnification factors. Other state-of-the-art works mainly focus on fully supervised learning approaches that rely heavily on human annotations. However, the scarcity of labeled and unlabeled data is a long-standing challenge in histopathology. Currently, representation learning without labels remains unexplored in the histopathology domain. The proposed method, Magnification Prior Contrastive Similarity (MPCS), enables self-supervised learning of representations without labels on the small-scale breast cancer dataset BreakHis by exploiting magnification factor, inductive transfer, and reduced human prior. The proposed method matches fully supervised state-of-the-art performance in malignancy classification when only 20% of labels are used in fine-tuning, and outperforms previous works in fully supervised settings on three public breast cancer datasets, including BreakHis. Further, it provides initial support for the hypothesis that reducing human prior leads to efficient representation learning in self-supervision, which will need further investigation. The implementation of this work is available online on GitHub.
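A hedged sketch of the contrastive-similarity idea behind MPCS, assuming an NT-Xent-style objective in which the positive pair is the same tissue patch embedded at two magnification factors; the function name, argument names, and temperature are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def mpcs_style_loss(z_low: torch.Tensor, z_high: torch.Tensor,
                    temperature: float = 0.5) -> torch.Tensor:
    """Contrastive loss over embeddings of the same patches at two
    magnifications: each row of `z_low` is the positive of the matching
    row of `z_high`; all other rows act as negatives."""
    z1 = F.normalize(z_low, dim=1)
    z2 = F.normalize(z_high, dim=1)
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)            # (2n, d)
    sim = z @ z.t() / temperature             # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))         # a sample is not its own pair
    # the positive for row i is row i + n, and vice versa
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)
```

In pre-training, `z_low` and `z_high` would come from an encoder applied to the same BreakHis patch at two magnification factors (e.g., 100x and 200x).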

Place, publisher, year, edition, pages
IEEE, 2023
Series
Proceedings IEEE Workshop on Applications of Computer Vision, ISSN 2472-6737, E-ISSN 2642-9381
Keywords
self-supervised learning, contrastive learning, representation learning, breast cancer, histopathological images, transfer learning, medical images
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-94845 (URN)10.1109/WACV56688.2023.00274 (DOI)2-s2.0-85149049398 (Scopus ID)978-1-6654-9346-8 (ISBN)
Conference
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2-7, 2023, Waikoloa, Hawaii, USA
Available from: 2022-12-14 Created: 2022-12-14 Last updated: 2023-10-11. Bibliographically approved
Upadhyay, R., Chhipa, P. C., Phlypo, R., Saini, R. & Liwicki, M. (2023). Multi-Task Meta Learning: learn how to adapt to unseen tasks. In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings: . Paper presented at 2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023. Institute of Electrical and Electronics Engineers Inc.
Multi-Task Meta Learning: learn how to adapt to unseen tasks
2023 (English) In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023. Conference paper, Published paper (Refereed)
Abstract [en]

This work proposes Multi-task Meta Learning (MTML), integrating two learning paradigms, Multi-Task Learning (MTL) and meta learning, to bring together the best of both worlds. In particular, it focuses on the simultaneous learning of multiple tasks, an element of MTL, and on promptly adapting to new tasks, a quality of meta learning. It is important to highlight that we focus on heterogeneous tasks, which are of distinct kinds, in contrast to the typically considered homogeneous tasks (e.g., all tasks being classification, or all being regression). The fundamental idea is to train a multi-task model such that, when an unseen task is introduced, it can learn in fewer steps while offering performance at least as good as conventional single-task learning on the new task, or its inclusion within the MTL. By conducting various experiments, we demonstrate this paradigm on two datasets and four tasks: NYU-v2 and the taskonomy dataset, for which we perform semantic segmentation, depth estimation, surface normal estimation, and edge detection. MTML achieves state-of-the-art results for three out of four tasks on the NYU-v2 dataset and two out of four on the taskonomy dataset. In the taskonomy dataset, many pseudo-labeled segmentation masks were found to lack classes expected in the ground truth; our MTML approach was nevertheless effective in detecting these missing classes, delivering good qualitative results, although its quantitative performance suffered from the incorrect ground-truth labels. The source code for reproducibility can be found at https://github.com/ricupa/MTML-learn-how-to-adapt-to-unseen-tasks.
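The "learn in fewer steps" idea above can be sketched as few-step adaptation of a fresh task head on top of an already-trained multi-task encoder. This is a minimal illustration only: the actual MTML procedure also meta-trains the shared encoder (see the linked repository), and all names and hyperparameters here are assumptions.

```python
import copy
import torch
import torch.nn as nn

def adapt_to_unseen_task(shared_encoder, new_head, loader, steps=5, lr=1e-3):
    """Adapt a new task head in a few gradient steps, reusing the
    shared multi-task representation (encoder kept frozen here)."""
    head = copy.deepcopy(new_head)
    opt = torch.optim.SGD(head.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                    # task-specific in practice
    for _, (x, y) in zip(range(steps), loader):
        with torch.no_grad():
            feats = shared_encoder(x)         # frozen shared features
        loss = loss_fn(head(feats), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```

For a dense prediction task such as depth estimation, `new_head` would be a small decoder and `loader` would yield image/depth-map batches for the unseen task.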

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
Series
Proceedings of the International Joint Conference on Neural Networks, ISSN 2161-4393, E-ISSN 2161-4407
Keywords
depth estimation, meta learning, Multi-task learning, semantic segmentation, surface normal estimation
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-101308 (URN)10.1109/IJCNN54540.2023.10191400 (DOI)2-s2.0-85169569118 (Scopus ID)978-1-6654-8868-6 (ISBN)978-1-6654-8867-9 (ISBN)
Conference
2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023
Available from: 2023-09-12 Created: 2023-09-12 Last updated: 2023-09-12. Bibliographically approved
Mishra, A. R., Kumar, R., Gupta, V., Prabhu, S., Upadhyay, R., Chhipa, P. C., . . . Saini, R. (2023). SignEEG v1.0: Multimodal Electroencephalography and Signature Database for Biometric Systems.
SignEEG v1.0: Multimodal Electroencephalography and Signature Database for Biometric Systems
2023 (English) Manuscript (preprint) (Other academic)
National Category
Human Computer Interaction Computer Engineering
Research subject
Machine Learning; Operation and Maintenance Engineering
Identifiers
urn:nbn:se:ltu:diva-101424 (URN)10.1101/2023.09.09.556960 (DOI)
Available from: 2023-09-24 Created: 2023-09-24 Last updated: 2023-10-02
Shirkhani, S., Mokayed, H., Saini, R. & Chai, H. Y. (2023). Study of AI-Driven Fashion Recommender Systems. SN Computer Science, 4(5), Article ID 514.
Study of AI-Driven Fashion Recommender Systems
2023 (English) In: SN Computer Science, ISSN 2662-995X, Vol. 4, no 5, article id 514. Article in journal (Refereed), Published
Abstract [en]

The rising diversity, volume, and pace of fashion manufacturing pose a considerable challenge in the fashion industry, making it difficult for customers to pick which product to purchase. In addition, fashion is an inherently subjective, cultural notion and an ensemble of clothing items that maintains a coherent style. In most of the domains in which recommender systems are developed (e.g., movies, e-commerce, etc.), similarity evaluation is the basis for recommendation; in the fashion domain, compatibility is instead a critical factor. Moreover, the raw visual features of product representations, which drive most of the algorithmic performance in the fashion domain, are distinguishable from the product metadata used in other domains. This literature review summarizes various Artificial Intelligence (AI) techniques that have lately been used in recommender systems for the fashion industry. AI enables higher-quality recommendations than earlier approaches. This has ushered in a new age for recommender systems, allowing for deeper insights into user-item relationships and representations and the discovery of patterns in demographical, textual, virtual, and contextual data. This work seeks to give a deeper understanding of the fashion recommender system domain by performing a comprehensive literature study of research on this topic over the past 10 years, focusing on image-based fashion recommender systems and taking AI improvements into account. The nuanced conceptions of this domain and their relevance are developed to justify fashion domain-specific characteristics.

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Artificial Intelligence (AI), Computer vision, Content-based image retrieval (CBIR), Deep learning, Fashion recommender system
National Category
Computer Sciences Information Systems
Research subject
Information Systems; Machine Learning
Identifiers
urn:nbn:se:ltu:diva-99430 (URN)10.1007/s42979-023-01932-9 (DOI)2-s2.0-85164519904 (Scopus ID)
Note

Validated; 2023; Level 1; 2023-08-10 (joosat);

Full-text license: CC BY

Available from: 2023-08-10 Created: 2023-08-10 Last updated: 2023-09-05. Bibliographically approved