2023 (English). In: Proceedings of the Annual Conference of the PHM Society 2023 / [ed] Chetan S. Kulkarni; Indranil Roychoudhury, The Prognostics and Health Management Society, 2023. Conference paper, Published paper (Refereed)
Abstract [en]
We propose a novel approach to facilitate supervised fault diagnosis on unlabelled but annotated industry datasets using human-centric technical language processing and weak supervision. Fault diagnosis through Condition Monitoring (CM) is vital for high safety and resource efficiency in the green transition and digital transformation of the process industry. Learning-based Intelligent Fault Diagnosis (IFD) methods are required to automate maintenance decisions and improve decision support for analysts. A major challenge is the lack of labelled industry datasets, limiting supervised IFD research to lab datasets. However, features learned from lab environments generalise poorly to field environments due to different signal distributions, artificial induction or acceleration of lab faults, and lab set-up properties such as average frequency profiles affecting learned features. In this study, we investigate how the unstructured free text fault annotations and maintenance work orders that are present in many industrial CM systems can be used for IFD through technical language processing, based on recent advances in natural language supervision. We introduce two distinct pipelines, one based on contrastive pre-training on large datasets, and one based on a small-data human-centric approach with unsupervised clustering methods. Finally, we showcase one example of the small-data fault classification implementation on a CM industry dataset with a SentenceBERT language model, kMeans clustering, and conventional signal processing methods. Fault class imbalance and time-shift uncertainty are overcome with weak supervision through aggregates of features, and human-centric clustering is used to integrate technical knowledge with the annotation-based fault classes. We show that our model can separate cable and sensor fault recordings from bearing-related fault recordings with an F1-score of 93.
To our knowledge, this is the first system to classify faults in field industry CM data based only on associated unstructured fault annotations.
Place, publisher, year, edition, pages
The Prognostics and Health Management Society, 2023
Series
Annual Conference of the PHM Society (PHM), ISSN 2325-0178
Keywords
Intelligent Fault Diagnosis, Technical Language Processing, Natural Language Processing, Condition Monitoring, Technical Language Supervision, Natural Language Supervision, Prognostics and Health Management, Industry Data
National Category
Natural Language Processing
Research subject
Machine Learning; Cyber-Physical Systems
Identifiers
urn:nbn:se:ltu:diva-95406 (URN); 10.36001/phmconf.2023.v15i1.3507 (DOI); 2-s2.0-85178380051 (Scopus ID)
Conference
15th Annual Conference of the Prognostics and Health Management Society, Salt Lake City, Utah, USA, October 28 - November 2, 2023
Projects
KnowIT FAST
Funder
Vinnova; Swedish Research Council Formas; Swedish Energy Agency
Note
Funder: Process industrial IT and Automation (PiIA) (2019-02533);
Full text license: CC BY;
This paper has previously appeared as a manuscript in a thesis;
ISBN for host publication: 978-1-936263-29-5
2023-01-27; 2023-01-27; 2025-04-09. Bibliographically approved