Publications (10 of 133)
Upadhyay, R., Phlypo, R., Saini, R. & Liwicki, M. (2024). Less is More: Towards parsimonious multi-task models using structured sparsity. In: Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang Wang, Zhihui Zhu (Ed.), Proceedings of Machine Learning Research, PMLR. Paper presented at 1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024 (pp. 590-601). Proceedings of Machine Learning Research, 234
2024 (English). In: Proceedings of Machine Learning Research, PMLR / [ed] Yuejie Chi, Gintare Karolina Dziugaite, Qing Qu, Atlas Wang Wang, Zhihui Zhu, Proceedings of Machine Learning Research, 2024, Vol. 234, p. 590-601. Conference paper, Published paper (Refereed)
Abstract [en]

Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model’s memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also possess the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely used multi-task learning datasets: NYU-v2 and CelebAMask-HQ. On both datasets, which consist of three different computer vision tasks each, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model’s performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
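
To make the channel-wise l1/l2 penalty concrete, the following is a minimal sketch in plain Python. It assumes the standard group-lasso form (the sum, over output channels, of the l2 norm of each channel's weights); the abstract does not give the exact formulation, so the grouping by output channel, the function names, and the `lam` and `tol` parameters are illustrative assumptions rather than the authors' implementation.

```python
import math


def group_sparsity_penalty(weights, lam=1.0):
    """Return lam * sum over output channels of that channel's l2 norm.

    `weights` is a nested list shaped [out_channels][in_channels][kh][kw].
    Grouping by output channel is an assumption made for this sketch.
    """
    total = 0.0
    for channel in weights:
        squared = sum(w * w for kernel in channel for row in kernel for w in row)
        total += math.sqrt(squared)
    return lam * total


def channel_sparsity(weights, tol=1e-8):
    """Return the fraction of output channels whose weights are all near zero."""
    zeroed = sum(
        1
        for channel in weights
        if all(abs(w) <= tol for kernel in channel for row in kernel for w in row)
    )
    return zeroed / len(weights)


# Two output channels, one input channel, 2x2 kernels (toy values).
toy_weights = [
    [[[3.0, 4.0], [0.0, 0.0]]],  # l2 norm of this channel is 5
    [[[0.0, 0.0], [0.0, 0.0]]],  # a fully zeroed (prunable) channel
]
print(group_sparsity_penalty(toy_weights))
print(channel_sparsity(toy_weights))
```

Because the norm is taken per channel before summing, the penalty tends to drive whole channels to zero together, which is what makes the resulting sparsity structured rather than scattered across individual weights.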

Place, publisher, year, edition, pages
Proceedings of Machine Learning Research, 2024
Keywords
Multi-task learning, structured sparsity, group sparsity, parameter pruning, semantic segmentation, depth estimation, surface normal estimation
National Category
Probability Theory and Statistics
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103838 (URN); 2-s2.0-85183883391 (Scopus ID)
Conference
1st Conference on Parsimony and Learning (CPAL 2024), Hong Kong, China, January 3-6, 2024
Funder
Knut and Alice Wallenberg Foundation
Note

Copyright © The authors and PMLR 2024. Authors retain copyright.

Available from: 2024-01-29 Created: 2024-01-29 Last updated: 2024-02-19 Bibliographically approved
Khan, S. N., Usman, A., Afzal, M. S., Tanveer, M., Liwicki, M., Almqvist, A. & Park, C. W. (2024). Numerical investigation of thermomechanical behavior of Yttrium barium zirconate-coated aluminum alloy piston in an internal combustion engine. Applied Thermal Engineering, 236(part B), Article ID 121603.
2024 (English). In: Applied Thermal Engineering, ISSN 1359-4311, E-ISSN 1873-5606, Vol. 236, no part B, article id 121603. Article in journal (Refereed) Published
Abstract [en]

Increasing the power-to-volume density of engines is under extensive investigation. A turbocharger, which is used to boost volumetric efficiency, also raises cylinder temperature and pressure, resulting in thermal distortions and reduced clearances in tribo-contacts, thereby compromising engine life. Thermal barrier coatings (TBCs) have shown potential to reduce heat losses, hazardous emissions, and heat flow toward the piston skirt in an internal combustion engine. In this study, a detailed thermo-mechanical analysis was performed for a diesel engine piston with a novel yttrium barium zirconate (YBZ) coating and then compared with other TBCs with varying thicknesses. The results revealed a notable decrease in piston substrate surface temperature when coated with various TBCs, with the YBZ coating demonstrating superior performance over the others. The piston with a 0.2 mm YBZ coating exhibited significant reductions of 15% and 10.3% in temperature and thermal stress, respectively, thus enhancing piston durability. The better performance of the novel YBZ coating could be attributed to its stable thermal and elastic properties and its lower thermal conductivity compared with other TBC materials. The YBZ coating provides a promising solution to improve engine efficiency while extending engine life, making it an attractive option for the automotive industry.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Diesel engine, Piston, Thermal barrier coating, Substrate surface temperature, Thermal stress
National Category
Manufacturing, Surface and Joining Technology Energy Engineering
Research subject
Machine Learning; Machine Elements
Identifiers
urn:nbn:se:ltu:diva-101384 (URN); 10.1016/j.applthermaleng.2023.121603 (DOI)
Note

Validated;2023;Level 2;2023-09-19 (joosat);

Funder: Korean government (No. 002086731G0003118)

Available from: 2023-09-19 Created: 2023-09-19 Last updated: 2023-09-19 Bibliographically approved
Adewumi, T., Adeyemi, M., Anuoluwapo, A., Peters, B., Buzaaba, H., Samuel, O., . . . Liwicki, M. (2023). AfriWOZ: Corpus for Exploiting Cross-Lingual Transfer for Dialogue Generation in Low-Resource, African Languages. In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings. Paper presented at 2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023. Institute of Electrical and Electronics Engineers Inc.
2023 (English). In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Dialogue generation is an important NLP task fraught with many challenges. The challenges become more daunting for low-resource African languages. To enable the creation of dialogue agents for African languages, we contribute the first high-quality dialogue datasets for 6 African languages: Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda & Yorùbá. There are a total of 9,000 turns, each language having 1,500 turns, which we translate from a portion of the English multi-domain MultiWOZ dataset. Subsequently, we benchmark by investigating & analyzing the effectiveness of modelling through transfer learning by utilizing state-of-the-art (SoTA) deep monolingual models: DialoGPT and BlenderBot. We compare the models with a simple seq2seq baseline using perplexity. Besides this, we conduct human evaluation of single-turn conversations by using majority votes and measure inter-annotator agreement (IAA). We find that the hypothesis that deep monolingual models learn some abstractions that generalize across languages holds. We observe human-like conversations, to different degrees, in 5 out of the 6 languages. The language with the most transferable properties is Nigerian Pidgin English, with a human-likeness score of 78.1%, of which 34.4% are unanimous. We freely provide the datasets and host the model checkpoints/demos on the HuggingFace hub for public access.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
Series
Proceedings of the International Joint Conference on Neural Networks, ISSN 2161-4393, E-ISSN 2161-4407
Keywords
crosslingual, dialogue systems, low-resource, multilingual, NLG
National Category
Language Technology (Computational Linguistics) Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-101305 (URN); 10.1109/IJCNN54540.2023.10191208 (DOI); 2-s2.0-85169561924 (Scopus ID); 978-1-6654-8868-6 (ISBN); 978-1-6654-8867-9 (ISBN)
Conference
2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023
Available from: 2023-09-12 Created: 2023-09-12 Last updated: 2023-09-12 Bibliographically approved
Saad, A., Usman, A., Arif, S., Liwicki, M. & Almqvist, A. (2023). Bearing Fault Detection Scheme Using Machine Learning for Condition Monitoring Applications. In: Proceedings of the International Conference on Mechanical, Automotive and Mechatronics Engineering (ICMAME 2023). Paper presented at International Conference on Mechanical, Automotive and Mechatronics Engineering (ICMAME 2023), 29-30 April 2023, Dubai, UAE. ICMAME, Article ID 137.
2023 (English). In: Proceedings of the International Conference on Mechanical, Automotive and Mechatronics Engineering (ICMAME 2023), ICMAME, 2023, article id 137. Conference paper, Published paper (Other academic)
Abstract [en]

Bearings are the significant components among the rolling machine elements subjected to high wear and tear. The timely detection of faults in such components rotating at higher frequencies can save substantial maintenance costs and production setbacks. Physical examination and fault detection by human experts is always challenging at runtime. Predictive maintenance and real-time condition monitoring are gaining higher utility with the advent of suitable instrumentation and machine learning classifiers. A convolutional neural network (CNN) based bearing fault detection scheme is developed in this research work. The acquired sensory data of vibration signals are converted into the frequency domain and then fed to the classifier for spectral feature extraction and fault classification. The CNN architecture is trained and tested using a bearing dataset available online. The model is further tested and validated with the data acquired from an indigenously designed bearing test rig. The proposed scheme has successfully detected inner and outer race faults and no fault or normal state. This multiclass fault classification has shown promising results with 97.68% accuracy, 96.9% precision, 99.14% sensitivity, 98.01% F1-score, and 93.65% specificity. The achieved results validate the utility of the proposed detection system. Hence the presented scheme has deployment potential for real-time condition monitoring and predictive maintenance applications.
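
The reported figures (accuracy, precision, sensitivity, F1-score, specificity) all derive from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated one-vs-rest. The abstract does not say how the scores were aggregated across the three classes (inner race fault, outer race fault, normal), so the function name and the toy counts are illustrative assumptions, not the authors' evaluation code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives for one class treated one-vs-rest.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy counts, chosen only to exercise the formulas.
for name, value in classification_metrics(tp=8, fp=2, tn=9, fn=1).items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity answer different questions (missed faults versus false alarms), which is why a condition-monitoring study typically reports both alongside accuracy.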

Place, publisher, year, edition, pages
ICMAME, 2023
National Category
Other Civil Engineering
Research subject
Machine Learning; Machine Elements
Identifiers
urn:nbn:se:ltu:diva-101168 (URN); 10.53375/icmame.2023.137 (DOI); 978-625-00-1526-1 (ISBN)
Conference
International Conference on Mechanical, Automotive and Mechatronics Engineering (ICMAME 2023), 29-30 April 2023, Dubai, UAE
Funder
The Kempe Foundations; Swedish Research Council, DNR 2019-04293
Available from: 2023-09-04 Created: 2023-09-04 Last updated: 2023-09-04 Bibliographically approved
Simistira Liwicki, F., Gupta, V., Saini, R., De, K., Abid, N., Rakesh, S., . . . Eriksson, J. (2023). Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition. Scientific Data, 10, Article ID 378.
2023 (English). In: Scientific Data, E-ISSN 2052-4463, Vol. 10, article id 378. Article in journal (Refereed) Published
Abstract [en]

The recognition of inner speech, which could give a ‘voice’ to patients who have no ability to speak or move, is a challenge for brain-computer interfaces (BCIs). A shortcoming of the available datasets is that they do not combine modalities to increase the performance of inner speech recognition. Multimodal datasets of brain data enable the fusion of neuroimaging modalities with complementary properties, such as the high spatial resolution of functional magnetic resonance imaging (fMRI) and the temporal resolution of electroencephalography (EEG), and therefore are promising for decoding inner speech. This paper presents the first publicly available bimodal dataset containing EEG and fMRI data acquired nonsimultaneously during inner-speech production. Data were obtained from four healthy, right-handed participants during an inner-speech task with words in either a social or numerical category. Each of the 8 word stimuli was assessed with 40 trials, resulting in 320 trials in each modality for each participant. The aim of this work is to provide a publicly available bimodal dataset on inner speech, contributing towards speech prostheses.

Place, publisher, year, edition, pages
Springer Nature, 2023
National Category
Computer Sciences Computer Vision and Robotics (Autonomous Systems)
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-98322 (URN); 10.1038/s41597-023-02286-w (DOI); 001006100600001 (); 37311807 (PubMedID); 2-s2.0-85161923014 (Scopus ID)
Note

Validated;2023;Level 2;2023-06-13 (hanlid);

Funder: Grants for Excellent Research Projects Proposals of SRT.ai 2022

Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2023-10-11 Bibliographically approved
Adewumi, T., Södergren, I., Alkhaled, L., Sabry, S. S., Liwicki, F. & Liwicki, M. (2023). Bipol: Multi-axes Evaluation of Bias with Explainability in Benchmark Datasets. In: Galia Angelova, Maria Kunilovskaya and Ruslan Mitkov (Ed.), Proceedings of Recent Advances in Natural Language Processing. Paper presented at International Conference Recent Advances In Natural Language Processing (RANLP 2023), Varna, Bulgaria, September 4-6, 2023 (pp. 1-10). Incoma Ltd.
2023 (English). In: Proceedings of Recent Advances in Natural Language Processing / [ed] Galia Angelova, Maria Kunilovskaya and Ruslan Mitkov, Incoma Ltd., 2023, p. 1-10. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate five English NLP benchmark datasets (on the superGLUE leaderboard) and two Swedish datasets for bias, along multiple axes. The datasets are the following: Boolean Question (Boolq), CommitmentBank (CB), Winograd Schema Challenge (WSC), Winogender diagnostic (AXg), Recognising Textual Entailment (RTE), Swedish CB, and SWEDN. Bias can be harmful and it is known to be common in data, which ML models learn from. In order to mitigate bias in data, it is crucial to be able to estimate it objectively. We use bipol, a novel multi-axes bias metric with explainability, to estimate and explain how much bias exists in these datasets. Multilingual, multi-axes bias evaluation is not very common. Hence, we also contribute a new, large Swedish bias-labeled dataset (of 2 million samples), translated from the English version and train the SotA mT5 model on it. In addition, we contribute new multi-axes lexica for bias detection in Swedish. We make the codes, model, and new dataset publicly available.

Place, publisher, year, edition, pages
Incoma Ltd., 2023
Series
International conference Recent advances in natural language processing, E-ISSN 2603-2813 ; 2023
National Category
Language Technology (Computational Linguistics)
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103097 (URN); 10.26615/978-954-452-092-2_001 (DOI)
Conference
International Conference Recent Advances In Natural Language Processing (RANLP 2023), Varna, Bulgaria, September 4-6, 2023
Note

ISBN for host publication: 978-954-452-092-2

Available from: 2023-11-30 Created: 2023-11-30 Last updated: 2023-12-01 Bibliographically approved
Chhipa, P. C., Rodahl Holmgren, J., De, K., Saini, R. & Liwicki, M. (2023). Can Self-Supervised Representation Learning Methods Withstand Distribution Shifts and Corruptions? In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023). Paper presented at IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, October 2-6, 2023 (pp. 4469-4478). Institute of Electrical and Electronics Engineers Inc.
2023 (English). In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Institute of Electrical and Electronics Engineers Inc., 2023, p. 4469-4478. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
National Category
Computer Vision and Robotics (Autonomous Systems) Computer Sciences
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-103984 (URN); 10.1109/ICCVW60793.2023.00481 (DOI); 2-s2.0-85182928560 (Scopus ID)
Conference
IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, October 2-6, 2023
Note

ISBN for host publication: 979-8-3503-0745-0;

Available from: 2024-01-29 Created: 2024-01-29 Last updated: 2024-01-29
Löwenmark, K., Sandin, F., Liwicki, M. & Schnabel, S. (2023). Dataset with condition monitoring vibration data annotated with technical language, from paper machine industries in northern Sweden. Svensk Nationell Datatjänst (SND)
2023 (English). Data set, Primary data
Alternative title[sv]
Dataset med tillståndsövervakningsvibrationsdata annoterat med tekniskt språk, från pappersmaskinsindustri i norra Sverige
Abstract [en]

Labelled industry datasets are one of the most valuable assets in prognostics and health management (PHM) research. However, creating labelled industry datasets is both difficult and expensive, making publicly available industry datasets rare at best, in particular labelled datasets. Recent studies have showcased that industry annotations can be used to train artificial intelligence models directly on industry data ( https://doi.org/10.36001/ijphm.2022.v13i2.3137 , https://doi.org/10.36001/phmconf.2023.v15i1.3507 ), but while many industry datasets also contain text descriptions or logbooks in the form of annotations and maintenance work orders, few, if any, are publicly available. Therefore, we release a dataset consisting of annotated signal data from two large (80mx10mx10m) paper machines, from a Kraftliner production company in northern Sweden. The data consists of 21 090 pairs of signals and annotations from one year of production. The annotations are written in Swedish, by on-site Swedish experts, and the signals consist primarily of accelerometer vibration measurements from the two machines. The dataset is structured as a Pandas dataframe and serialized as a pickle (.pkl) file and a JSON (.json) file. The first column (‘id’) is the ID of the samples; the second column (‘Spectra’) contains the fast Fourier transform and envelope-transformed vibration signals; the third column (‘Notes’) contains the associated annotations, mapped so that each annotation is associated with all signals from ten days before the annotation date, up to the annotation date; and finally the fourth column (‘Embeddings’) contains pre-computed embeddings using Swedish SentenceBERT. Each row corresponds to a vibration measurement sample, though there is no distinction in this data between which sensor or machine part each measurement is from.
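
The annotation-to-signal mapping described above (each annotation linked to all signals from the ten days leading up to its date) can be sketched as follows. This is a hedged illustration in plain Python, not the dataset's own preprocessing code: the function name, the tuple layout, the inclusive treatment of both window endpoints, and the sample note text are all assumptions.

```python
from datetime import date, timedelta


def pair_signals_with_notes(signals, notes, window_days=10):
    """Pair each annotation with every signal measured in its look-back window.

    signals: list of (signal_id, measured_on) tuples.
    notes:   list of (note_text, note_date) tuples.
    A signal is paired with a note when
    note_date - window_days <= measured_on <= note_date
    (inclusive endpoints are an assumption of this sketch).
    """
    pairs = []
    for note_text, note_date in notes:
        window_start = note_date - timedelta(days=window_days)
        for signal_id, measured_on in signals:
            if window_start <= measured_on <= note_date:
                pairs.append((signal_id, note_text))
    return pairs


# Hypothetical records, for illustration only.
signals = [
    ("s1", date(2023, 1, 1)),
    ("s2", date(2023, 1, 8)),
    ("s3", date(2023, 1, 20)),
]
notes = [("bearing replaced", date(2023, 1, 10))]
print(pair_signals_with_notes(signals, notes))
```

One consequence of this mapping is that a single annotation is repeated across many rows, so the number of signal-annotation pairs can be much larger than the number of distinct annotations.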

Abstract [sv]

Labelled industry datasets are among the most valuable assets in prognostics and condition monitoring research. Producing labelled datasets is both difficult and expensive, which makes publicly available industry datasets rare, especially labelled ones. Studies have shown, however, that industry annotations can be used to train AI models directly on industry data ( https://doi.org/10.36001/ijphm.2022.v13i2.3137 , https://doi.org/10.36001/phmconf.2023.v15i1.3507 ), but although many industry datasets contain the necessary texts, few, if any, such datasets are publicly available. Therefore, we release a dataset containing annotated signal data from two large (80x10x10m) paper machines at a paper mill in northern Sweden. The data consists of 21 090 pairs of signals and annotations from one year of production. The annotations are written in Swedish by on-site experts, and the signals consist mainly of accelerometer vibration measurements from the two machines. The dataset consists of one year of annotated vibration sensor measurements from two paper machines, structured as a Pandas dataframe and serialized as a pickle file (.pkl) and a JSON file (.json). The first column (’id’) is the ID per sample; the second column (’Spectra’) contains the fast-Fourier-transformed and envelope-transformed vibration signals; the third column (’Notes’) contains the associated annotations, mapped so that each annotation is linked to all signals from ten days before the annotation date up to the annotation date; and finally the fourth column (’Embeddings’) contains pre-computed text representations from Swedish SentenceBERT. Each row corresponds to one vibration measurement sample, although this data does not distinguish which sensor and machine part each measurement comes from.

Place, publisher, year
Svensk Nationell Datatjänst (SND), 2023
Keywords
Paper industry, Condition monitoring, Language technology, Signal processing, Fault detection, Natural language processing, Technical language processing, Technical language supervision, Natural language supervision, Fault diagnosis, Intelligent fault diagnosis, Prognostics and health management
National Category
Language Technology (Computational Linguistics) Computer Sciences
Research subject
Machine Learning; Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-103146 (URN); 10.5878/z34p-qj52 (DOI)
Funder
Vinnova, 2019-02533
Note

CC BY-NC 4.0 

Available from: 2023-12-01 Created: 2023-12-01 Last updated: 2023-12-20 Bibliographically approved
Javed, S., Usman, M., Sandin, F., Liwicki, M. & Mokayed, H. (2023). Deep Ontology Alignment Using a Natural Language Processing Approach for Automatic M2M Translation in IIoT. Sensors, 23(20), Article ID 8427.
2023 (English). In: Sensors, E-ISSN 1424-8220, Vol. 23, no 20, article id 8427. Article in journal (Refereed) Published
Abstract [en]

The technical capabilities of modern Industry 4.0 and Industry 5.0 are vast and growing exponentially daily. The present-day Industrial Internet of Things (IIoT) combines manifold underlying technologies that require real-time interconnection and communication among heterogeneous devices. Smart cities are established with sophisticated designs and control of seamless machine-to-machine (M2M) communication, to optimize resources, costs, performance, and energy distributions. All the sensory devices within a building interact to maintain a sustainable climate for residents and intuitively optimize the energy distribution to optimize energy production. However, this encompasses quite a few challenges for devices that lack a compatible and interoperable design. The conventional solutions are restricted to limited domains or rely on engineers designing and deploying translators for each pair of ontologies. This is a costly process in terms of engineering effort and computational resources. An issue persists that a new device with a different ontology must be integrated into an existing IoT network. We propose a self-learning model that can determine the taxonomy of devices given their ontological meta-data and structural information. The model finds matches between two distinct ontologies using a natural language processing (NLP) approach to learn linguistic contexts. Then, by visualizing the ontological network as a knowledge graph, it is possible to learn the structure of the meta-data and understand the device's message formulation. Finally, the model can align entities of ontological graphs that are similar in context and structure. Furthermore, the model performs dynamic M2M translation without requiring extra engineering or hardware resources.

Place, publisher, year, edition, pages
MDPI, 2023
Keywords
deep learning, industrial internet of things, Industry 4.0, Industry 5.0, IIoT, knowledge graph, M2M translation, ontology alignment, self-attention, smart city
National Category
Computer Sciences Communication Systems
Research subject
Machine Learning; Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-102316 (URN); 10.3390/s23208427 (DOI); 001095200100001 (); 37896522 (PubMedID); 2-s2.0-85175279210 (Scopus ID)
Note

Validated;2023;Level 2;2023-11-14 (marisr);

License fulltext: CC BY

Available from: 2023-11-06 Created: 2023-11-06 Last updated: 2023-12-12 Bibliographically approved
Chopra, M., Chhipa, P. C., Mengi, G., Gupta, V. & Liwicki, M. (2023). Domain Adaptable Self-supervised Representation Learning on Remote Sensing Satellite Imagery. In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings. Paper presented at 2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023. Institute of Electrical and Electronics Engineers Inc.
2023 (English). In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023. Conference paper, Published paper (Refereed)
Abstract [en]

This work presents a novel domain adaption paradigm for studying contrastive self-supervised representation learning and knowledge transfer using remote sensing satellite data. Major state-of-the-art remote sensing visual domain efforts primarily focus on fully supervised learning approaches that rely entirely on human annotations. On the other hand, human annotations in remote sensing satellite imagery are always subject to limited quantity due to high costs and domain expertise, making transfer learning a viable alternative. The proposed approach investigates the knowledge transfer of self-supervised representations across the distinct source and target data distributions in depth in the remote sensing data domain. In this arrangement, self-supervised contrastive learning-based pretraining is performed on the source dataset, and downstream tasks are performed on the target datasets in a round-robin fashion. Experiments are conducted on three publicly available datasets, UC Merced Landuse (UCMD), SIRI-WHU, and MLRSNet, for different downstream classification tasks versus label efficiency. In self-supervised knowledge transfer, the proposed approach achieves state-of-the-art performance with label efficiency labels and outperforms a fully supervised setting. A more in-depth qualitative examination reveals consistent evidence for explainable representation learning. The source code and trained models are published on GitHub.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
Series
Proceedings of the International Joint Conference on Neural Networks, ISSN 2161-4393, E-ISSN 2161-4407
Keywords
contrastive learning, domain adaptation, remote sensing, representation learning, satellite image, self-supervised learning
National Category
Computer Sciences Signal Processing
Research subject
Machine Learning
Identifiers
urn:nbn:se:ltu:diva-101307 (URN); 10.1109/IJCNN54540.2023.10191249 (DOI); 2-s2.0-85169612572 (Scopus ID); 978-1-6654-8868-6 (ISBN); 978-1-6654-8867-9 (ISBN)
Conference
2023 International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast, Australia, June 18-23, 2023
Available from: 2023-09-12 Created: 2023-09-12 Last updated: 2023-09-12 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-4029-6574