Publications (10 of 127)
Rotari, M. & Kulahci, M. (2025). Correlation to causality. Quality Engineering, 37(1), 162-172
2025 (English). In: Quality Engineering, ISSN 0898-2112, E-ISSN 1532-4222, Vol. 37, no 1, p. 162-172. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
Taylor & Francis, 2025
National Category
Economics
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-108371 (URN); 10.1080/08982112.2024.2372489 (DOI); 001261070400001 (); 2-s2.0-85197264916 (Scopus ID)
Note

Validated; 2025; Level 2; 2025-02-24 (u2)

Available from: 2024-07-24 Created: 2024-07-24 Last updated: 2025-02-24. Bibliographically approved
Centofanti, F., Kulahci, M., Lepore, A. & Spooner, M. P. (2025). Real-time monitoring of functional data. Journal of Quality Technology
2025 (English). In: Journal of Quality Technology, ISSN 0022-4065. Article in journal (Refereed). Epub ahead of print
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:ltu:diva-111698 (URN); 10.1080/00224065.2024.2430978 (DOI); 001404822400001 (); 2-s2.0-85216483660 (Scopus ID)
Available from: 2025-02-21 Created: 2025-02-21 Last updated: 2025-02-21
Cacciarelli, D. & Kulahci, M. (2025). Semi-supervised learning for predictive modeling in industrial applications. Quality Engineering
2025 (English). In: Quality Engineering, ISSN 0898-2112, E-ISSN 1532-4222. Article in journal (Refereed). Epub ahead of print
Place, publisher, year, edition, pages
Taylor & Francis, 2025
National Category
Computer Sciences
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-111278 (URN); 10.1080/08982112.2024.2440371 (DOI); 001389815500001 (); 2-s2.0-85214142809 (Scopus ID)
Available from: 2025-01-13 Created: 2025-01-13 Last updated: 2025-01-16
Hviid Hansen, H., Külahci, M. & Friis Nielsen, B. (2024). A primer on predictive maintenance: Potential benefits and practical challenges. Quality Engineering, 36(3), 638-649
2024 (English). In: Quality Engineering, ISSN 0898-2112, E-ISSN 1532-4222, Vol. 36, no 3, p. 638-649. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
Taylor and Francis Ltd., 2024
National Category
Reliability and Maintenance
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-105016 (URN); 10.1080/08982112.2024.2331140 (DOI); 001190577100001 (); 2-s2.0-85189210018 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-07-01 (joosat)

Available from: 2024-04-08 Created: 2024-04-08 Last updated: 2024-07-01. Bibliographically approved
Cacciarelli, D. & Kulahci, M. (2024). Active learning for data streams: a survey. Machine Learning, 113(1), 185-239
2024 (English). In: Machine Learning, ISSN 0885-6125, E-ISSN 1573-0565, Vol. 113, no 1, p. 185-239. Article, review/survey (Refereed). Published
Abstract [en]

Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in real time. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research.

Place, publisher, year, edition, pages
Springer Nature, 2024
Keywords
Bandits, Concept drift, Data streams, Experimental design, Online active learning, Online learning, Query strategies, Selective sampling, Stream-based active learning, Unlabeled data
National Category
Computer Sciences
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-103014 (URN); 10.1007/s10994-023-06454-2 (DOI); 001205212600007 (); 2-s2.0-85177180685 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-04-02 (hanlid)

Funder: DTU Strategic Alliances Fund;

Full text license: CC BY 4.0

Available from: 2023-11-29 Created: 2023-11-29 Last updated: 2024-11-20. Bibliographically approved
Cacciarelli, D. & Kulahci, M. (2024). Active learning for industrial applications. Quality Engineering
2024 (English). In: Quality Engineering, ISSN 0898-2112, E-ISSN 1532-4222. Article in journal (Refereed). Epub ahead of print
Abstract [en]

Industrial data is often available only in an unlabeled form as obtaining the label (the response) for the input data can be a challenging and time-consuming task. This Quality Quandaries provides an overview of active learning-based sampling methods for streamlining the development of classification and regression models in label-scarce environments. A case study on active learning for vision-based industrial inspection is presented. The case study shows how selecting the most informative data points to label can at a fraction of the cost achieve model performances similar to the case where all input data is labeled.

Place, publisher, year, edition, pages
Taylor & Francis, 2024
Keywords
sampling strategies, semi-supervised learning, quality control, unlabeled data, unsupervised learning
National Category
Computer Sciences
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-110152 (URN); 10.1080/08982112.2024.2402376 (DOI); 001315043600001 (); 2-s2.0-85204228852 (Scopus ID)
Available from: 2024-10-01 Created: 2024-10-01 Last updated: 2024-12-05
Rotari, M., Diaz, V. F., De Ketelaere, B. & Kulahci, M. (2024). An extension of PARAFAC to analyze multi-group three-way data. Chemometrics and Intelligent Laboratory Systems, 246, Article ID 105089.
2024 (English). In: Chemometrics and Intelligent Laboratory Systems, ISSN 0169-7439, E-ISSN 1873-3239, Vol. 246, article id 105089. Article in journal (Refereed). Published
Abstract [en]

This paper introduces a novel methodology for analyzing three-way array data with a multi-group structure. Three-way arrays are commonly observed in various domains, including image analysis, chemometrics, and real-world applications. In this paper, we use a practical case study of process modeling in additive manufacturing, where batches are structured according to multiple groups. Vast volumes of data for multiple variables and process stages are recorded by sensors installed on the production line for each batch. For these three-way arrays, the link between the final product and the observations creates a grouping structure in the observations. This grouping may hamper gaining insight into the process if only some of the groups dominate the controlled variability of the products. In this study, we develop an extension of the PARAFAC model that takes into account the grouping structure of three-way data sets. With this extension, it is possible to estimate a model that is representative of all the groups simultaneously by finding their common structure. The proposed model has been applied to three simulation data sets and a real manufacturing case study. The capability to find the common structure of the groups is compared to PARAFAC and the insights into the importance of variables delivered by the models are discussed.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Additive manufacturing, Factor analysis, Multi-group data set, PARAFAC
National Category
Probability Theory and Statistics
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-104469 (URN); 10.1016/j.chemolab.2024.105089 (DOI); 001197728600001 (); 2-s2.0-85185833599 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-03-06 (hanlid)

Full text license: CC BY

Available from: 2024-03-06 Created: 2024-03-06 Last updated: 2024-11-20. Bibliographically approved
Hansen, H. H., MacDougall, N., Jensen, C. D., Kulahci, M. & Nielsen, B. F. (2024). Condition monitoring of wind turbine faults: Modeling and savings. Applied Mathematical Modelling, 130, 160-174
2024 (English). In: Applied Mathematical Modelling, ISSN 0307-904X, E-ISSN 1872-8480, Vol. 130, p. 160-174. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
Elsevier Inc., 2024
National Category
Reliability and Maintenance
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-104880 (URN); 10.1016/j.apm.2024.02.036 (DOI); 001236940300001 (); 2-s2.0-85187227267 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-04-05 (marisr)

Available from: 2024-03-26 Created: 2024-03-26 Last updated: 2024-11-20. Bibliographically approved
Cacciarelli, D., Kulahci, M. & Tyssedal, J. S. (2024). Robust online active learning. Quality and Reliability Engineering International, 40(1), 277-296
2024 (English). In: Quality and Reliability Engineering International, ISSN 0748-8017, E-ISSN 1099-1638, Vol. 40, no 1, p. 277-296. Article in journal (Refereed). Published
Abstract [en]

In many industrial applications, obtaining labeled observations is not straightforward as it often requires the intervention of human experts or the use of expensive testing equipment. In these circumstances, active learning can be highly beneficial in suggesting the most informative data points to be used when fitting a model. Reducing the number of observations needed for model development alleviates both the computational burden required for training and the operational expenses related to labeling. Online active learning, in particular, is useful in high-volume production processes where the decision about the acquisition of the label for a data point needs to be taken within an extremely short time frame. However, despite the recent efforts to develop online active learning strategies, the behavior of these methods in the presence of outliers has not been thoroughly examined. In this work, we investigate the performance of online active linear regression in contaminated data streams. Our study shows that the currently available query strategies are prone to sample outliers, whose inclusion in the training set eventually degrades the predictive performance of the models. To address this issue, we propose a solution that bounds the search area of a conditional D-optimal algorithm and uses a robust estimator. Our approach strikes a balance between exploring unseen regions of the input space and protecting against outliers. Through numerical simulations, we show that the proposed method is effective in improving the performance of online active learning in the presence of outliers, thus expanding the potential applications of this powerful tool.

Place, publisher, year, edition, pages
John Wiley & Sons, 2024
Keywords
active learning, data stream, optimal experimental design, outliers, robust regression, unlabeled data
National Category
Computer Sciences; Computer graphics and computer vision
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-98586 (URN); 10.1002/qre.3392 (DOI); 001002100700001 (); 2-s2.0-85161536751 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-02-14 (sofila)

Funder: DTU Strategic Alliances Fund;

Full text license: CC BY 4.0

Available from: 2023-06-19 Created: 2023-06-19 Last updated: 2025-02-01. Bibliographically approved
Rotari, M. & Kulahci, M. (2024). Variable selection wrapper in presence of correlated input variables for random forest models. Quality and Reliability Engineering International, 40(1), 297-312
2024 (English). In: Quality and Reliability Engineering International, ISSN 0748-8017, E-ISSN 1099-1638, Vol. 40, no 1, p. 297-312. Article in journal (Refereed). Published
Abstract [en]

In most data analytic applications in manufacturing, understanding the data-driven models plays a crucial role in complementing the engineering knowledge about the production process. Identifying relevant input variables, rather than only predicting the response through some “black-box” model, is of great interest in many applications. There is, therefore, a growing focus on describing the contributions of the input variables to the model in the form of “variable importance”, which is readily available in certain machine learning methods such as random forest (RF). Once a ranking based on the importance measure of the variables is established, the question of how many variables are truly relevant in predicting the output variable rises. In this study, we focus on the Boruta algorithm, which is a wrapper around the RF model. It is a variable selection tool that assesses the variable importance measure for the RF model. It has been previously shown in the literature that the correlation among the input variables, which is often a common occurrence in high dimensional data, distorts and overestimates the importance of variables. The Boruta algorithm is also affected by this resulting in a larger set of input variables deemed important. To overcome this issue, in this study, we propose an extension of the Boruta algorithm for the correlated data by exploiting the conditional importance measure. This extension greatly improves the Boruta algorithm in the case of high correlation among variables and provides a more precise ranking of the variables that significantly contribute to the response. We believe this approach can be used in many industrial applications by providing more transparency and understanding of the process.

Place, publisher, year, edition, pages
John Wiley & Sons, 2024
Keywords
additive manufacturing, Boruta algorithm, conditional importance, random forest, variable selection algorithm
National Category
Computer Sciences
Research subject
Quality Technology and Logistics
Identifiers
urn:nbn:se:ltu:diva-99118 (URN); 10.1002/qre.3398 (DOI); 001009829700001 (); 2-s2.0-85162025946 (Scopus ID)
Note

Validated; 2024; Level 2; 2024-02-14 (sofila)

Full text license: CC BY-NC 4.0

Available from: 2023-07-03 Created: 2023-07-03 Last updated: 2024-02-14. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-4222-9631