In industrial settings, collecting labeled data, i.e., input data paired with the corresponding output, is often expensive and time-consuming, whereas unlabeled process data is typically available in large quantities. This Quality Quandaries article explores semi-supervised learning (SSL) as a way to improve predictive modeling by using both labeled and unlabeled data. We provide a general discussion of SSL methods and their applicability in industrial environments, where efficient use of data is critical. To demonstrate the practical utility of SSL, we present three cases: one on regression, which uses semi-supervised autoencoders to extract meaningful features from unlabeled data, and two on classification, which use the Label Spreading approach. These examples highlight the potential of SSL techniques to address data limitations and improve predictive performance in industrial applications.
Validated;2025;Level 2;2025-05-30 (u5)