2026 (English) In: Engineering Research Express, E-ISSN 2631-8695, Vol. 8, no 7, article id 075224. Article in journal (Refereed) Published
Abstract [en]
Predictive reliability, or the ability to anticipate the failure probability of a system or component, is essential to extend the useful life of assets, prevent breakdowns, and minimize industrial incidents. Centrifugal pumps are among the most important assets in oil and gas applications, and their reliability is critical to operational availability and safety. As a result, predictive reliability strategies have increasingly been applied to these systems in recent years to enhance performance and reduce unplanned failures. Many studies have investigated the failure mechanisms of centrifugal pumps, and several time-to-failure predictive models have been developed, but the relative impact of the variables affecting their expected life is rarely quantified. Different methods often produce markedly different results because of differences in method definitions, model behavior, and data-related factors such as collinearity, sampling noise, and interactions. This paper addresses the issue by quantifying the influence of key variables through the application of four different methods. Three are based on statistical techniques: partial Likelihood Ratio (LR) χ² analysis, Bayesian coefficient magnitudes, and the variable inclusion order in Lasso regression applied to a Cox Proportional Hazards Model (PHM). The fourth method employs a Machine Learning (ML) technique, permutation importance applied to a Random Survival Forest (RSF) model. As a robustness check of the RSF model, a gradient boosting survival model was fitted. SHAP values were also computed for both ML models and compared with permutation importance scores to assess the stability and consistency of variable-importance rankings. The procedure was implemented on a real-world dataset from an oil refinery, consisting of 675 pumps with a set of 27 potential predictors.
Both ordinal and weighted rankings were computed, with weighted rankings providing deeper insights than ordinal rankings by capturing the relative differences between variables. Lasso and Bayesian Cox exhibited the highest variability in these rankings, while the RSF method showed the lowest variability (mean interquartile range: 2.7) and the strongest correlation (0.787) with the average ranking across models. Maintenance work orders consistently emerged as the most influential predictor of mean time between failures (MTBF), followed by pumped fluid, discharge pressure, and manufacturing year. To address variability, the Copeland–Llull voting method was applied to individual and aggregated rankings, reducing dispersion and improving robustness. Bootstrap resampling further quantified uncertainty and confirmed the stabilizing effect of this technique. Although minor changes occurred in variable ordering, key predictors remained dominant. Grouping variables into six categories revealed maintenance as the most impactful category, followed by operating conditions. Overall, this approach enhances ranking stability and provides actionable insights for reliability analysis.
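As an illustration of the voting step mentioned in the abstract, the following is a minimal sketch of a Copeland-style pairwise aggregation of several variable-importance rankings. It is not the authors' implementation: the abstract does not specify the exact Copeland–Llull variant, tie handling, or weighting, so the scoring rule here (pairwise wins minus losses) is an assumption. The function name `copeland_ranking`, the method labels, and the rank values are hypothetical toy data, not results from the paper.

```python
from itertools import combinations


def copeland_ranking(rankings):
    """Aggregate rankings with a Copeland-style pairwise count.

    rankings: dict of method name -> dict of variable -> rank (1 = most important).
    Returns (variables ordered best first, Copeland score per variable).
    Assumption: score = pairwise wins minus pairwise losses; tied pairs score 0.
    """
    variables = sorted(next(iter(rankings.values())))
    score = {v: 0 for v in variables}
    for a, b in combinations(variables, 2):
        # Count how many methods rank a above b, and b above a.
        a_pref = sum(1 for r in rankings.values() if r[a] < r[b])
        b_pref = sum(1 for r in rankings.values() if r[b] < r[a])
        if a_pref > b_pref:
            score[a] += 1
            score[b] -= 1
        elif b_pref > a_pref:
            score[b] += 1
            score[a] -= 1
    # Higher score first; ties broken alphabetically for determinism.
    ordered = sorted(variables, key=lambda v: (-score[v], v))
    return ordered, score


# Hypothetical toy ranks for four predictors under four methods (illustrative only).
rankings = {
    "lr_chi2":   {"work_orders": 1, "pumped_fluid": 2, "discharge_pressure": 3, "manufacturing_year": 4},
    "bayes_cox": {"work_orders": 2, "pumped_fluid": 1, "discharge_pressure": 4, "manufacturing_year": 3},
    "lasso_cox": {"work_orders": 1, "pumped_fluid": 3, "discharge_pressure": 2, "manufacturing_year": 4},
    "rsf_perm":  {"work_orders": 1, "pumped_fluid": 2, "discharge_pressure": 3, "manufacturing_year": 4},
}

ordered, score = copeland_ranking(rankings)
print(ordered)
print(score)
```

Because each variable is judged only by majority outcomes in head-to-head comparisons, a single method that places a variable unusually high or low has limited effect on the aggregate order, which is one plausible reason such a scheme would reduce dispersion across methods.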
Place, publisher, year, edition, pages
Institute of Physics, 2026
Keywords
centrifugal pumps, mean time between failures, reliability prediction, variable importance, feature selection
National Category
Probability Theory and Statistics
Research subject
Operation and Maintenance Engineering
Identifiers
urn:nbn:se:ltu:diva-117212 (URN)
10.1088/2631-8695/ae56ce (DOI)
001734296900001 ()
2-s2.0-105035567877 (Scopus ID)
Note
Full text license: CC BY
2026-04-20, 2026-04-20, 2026-05-19