Data clustering and imputing using a two-level multi-objective genetic algorithm (GA): A case study of maintenance cost data for tunnel fans
Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics. ORCID iD: 0000-0002-1967-6604
Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics. ORCID iD: 0000-0001-5620-5265
Department of Industrial Engineering, School of Mechanical Engineering, Dongguan University of Technology, 523808 Dongguan, China. ORCID iD: 0000-0001-5317-0087
2018 (English). In: Cogent Engineering, E-ISSN 2331-1916, Vol. 5, no 1, p. 1-16, article id 1513304. Article in journal (Refereed). Published
Abstract [en]

Data clustering captures natural structures in data consisting of a set of objects and groups similar data together. The derived clusters can be used for scale analysis and to estimate missing data values in objects, as missing data have a negative effect on the computational validity of models. This study develops a new two-level multi-objective genetic algorithm (GA) to optimize clustering in order to reduce and impute missing cost data for fans used in road tunnels by the Swedish Transport Administration (Trafikverket). The first level uses a multi-objective GA based on fuzzy c-means to cluster cost data objects based on three main indices. The first is cluster centre outliers; the second is the compactness and separation of the data points and cluster centres; the third is the intensity of data points belonging to the derived clusters. Our clustering model is validated using k-means clustering. The second level uses a multi-objective GA to impute the missing cost data, reduced in size, using a valid data period. The optimal population has a low compactness and separation value, 0.1%, and a high intensity, 99%. It has three cluster centres, with the highest data reduction of 27%. These three cluster centres have a suitable geometry, so the cost data can be partitioned into relevant contents to be reduced for imputing. Our model shows better clustering detection and evaluation compared with k-means. The amounts of missing data for the two cost objects are: labour 57%, materials 81%. The second level shows highly correlated data (R-squared 0.99) after imputing the missing data objects. Therefore, a multi-objective GA can cluster and impute data to derive complete data that can be used for better forecasting estimates.
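
As a rough illustration of the fuzzy c-means step named in the abstract, the sketch below computes the standard fuzzy c-means membership of one-dimensional cost values in a set of candidate cluster centres, and then scores the centres with the Xie-Beni index. The Xie-Beni index is an assumption made here: it is one common compactness-and-separation measure, and the abstract does not say which measure the authors use. The function names, the fuzzifier m = 2 and the sample values are likewise hypothetical.

```python
def fcm_memberships(points, centres, m=2.0):
    """Standard fuzzy c-means membership of each 1-D point in each centre."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centres]
        if 0.0 in dists:
            # The point coincides with one or more centres: share full
            # membership among those centres and give zero to the rest.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [u / total for u in row]
        else:
            row = [
                1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                for d_i in dists
            ]
        memberships.append(row)
    return memberships


def xie_beni(points, centres, memberships, m=2.0):
    """Xie-Beni index (assumed measure): lower means compact, well-separated clusters."""
    compactness = sum(
        (u ** m) * (x - c) ** 2
        for x, row in zip(points, memberships)
        for u, c in zip(row, centres)
    )
    separation = min(
        (a - b) ** 2
        for i, a in enumerate(centres)
        for b in centres[i + 1:]
    )
    return compactness / (len(points) * separation)


# Hypothetical cost values and two candidate sets of cluster centres.
costs = [1.0, 2.0, 10.0, 11.0]
good_centres = [1.5, 10.5]
poor_centres = [5.0, 7.0]
good_score = xie_beni(costs, good_centres, fcm_memberships(costs, good_centres))
poor_score = xie_beni(costs, poor_centres, fcm_memberships(costs, poor_centres))
```

In a multi-objective GA, a score of this kind could serve as one of several objectives evaluated for each candidate set of centres; how the paper combines its three indices is not described in the abstract.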

Place, publisher, year, edition, pages
Taylor & Francis, 2018. Vol. 5, no 1, p. 1-16, article id 1513304
Keywords [en]
Data clustering, data imputing, multi-objective GA, fuzzy c-means, K-means clustering
National Category
Reliability and Maintenance; Other Civil Engineering
Research subject
Operation and Maintenance Engineering
Identifiers
URN: urn:nbn:se:ltu:diva-70375; DOI: 10.1080/23311916.2018.1513304; ISI: 000444436800001; Scopus ID: 2-s2.0-85052696347; OAI: oai:DiVA.org:ltu-70375; DiVA, id: diva2:1238830
Note

Validated; 2018; Level 2; 2018-10-08 (johcin)

Available from: 2018-08-14. Created: 2018-08-14. Last updated: 2023-09-04. Bibliographically approved
In thesis
1. Two-Level Multi-Objective Genetic Algorithm for Risk-Based Life Cycle Cost Analysis
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Artificial intelligence (AI) is a field of science and engineering that encompasses a wide variety of subfields, ranging from general areas (learning and perception) to specific topics, such as proving mathematical theorems. AI, and specifically multi-objective genetic algorithms (MOGAs), can be applied to risk-based life cycle cost (LCC) analysis to estimate the optimal replacement time of tunnel fan systems, with a view towards reducing the ownership cost and the risk cost and increasing company profitability. MOGA can create systems that are capable of solving problems that AI and LCC analyses cannot accomplish alone.

The purpose of this thesis is to develop a two-level MOGA method for optimizing the replacement time of reparable systems. MOGA should be useful for machinery in general and for reparable systems specifically. This objective will be achieved by developing a system that combines techniques and integrates MOGA to yield the optimized replacement time. A further step towards this purpose is applying MOGA to cluster data and impute missing values, which yields cost data suitable for forecasting, for optimization and for identifying the optimal replacement time.

In the first stage, a two-level MOGA is proposed to optimize clustering to reduce and impute missing cost data. Level one uses a MOGA based on fuzzy c-means to cluster cost data objects based on three main indices. The first is cluster centre outliers; the second is the compactness and separation of the data points and cluster centres; the third is the intensity of data points belonging to the derived clusters. Level two uses a MOGA to impute the missing cost data by using a valid data period from the size-reduced data. In the second stage, a two-level MOGA is proposed to optimize time series forecasting. Level one implements a MOGA based on either an autoregressive integrated moving average (ARIMA) model or a dynamic regression (DR) model. Level two utilizes a MOGA based on different forecasting error rates to identify proper forecasting. These models are applied to simulated data for evaluation, since the influencing parameters cannot be controlled in the real cost data. In the final stage, a two-level MOGA is employed to optimize risk-based LCC analysis to find the optimal replacement time for reparable systems. Level one uses a MOGA based on a risk model to provide varying risk percentages, while level two uses a MOGA based on an LCC model to estimate the optimal replacement time of the reparable system.
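
The idea of ranking candidate forecasts on several error rates at once can be sketched as follows. This is a minimal illustration, not the method of the thesis: the abstract does not name the error rates, so MAE, RMSE and MAPE are assumed here, and the function names and sample values are hypothetical. A candidate is kept when no other candidate is at least as good on every error rate and strictly better on one (Pareto non-dominance), which is the usual selection notion in multi-objective optimization.

```python
import math


def error_rates(actual, forecast):
    """Return (MAE, RMSE, MAPE in percent); assumes no actual value is zero."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return (mae, rmse, mape)


def dominates(p, q):
    """True if p is no worse than q on every objective and better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(scores):
    """Keep the candidates whose score tuple is not dominated by any other."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(o, s) for other, o in scores.items() if other != name)
    }


# Hypothetical actual costs and two hypothetical candidate forecasts.
actual = [100.0, 110.0, 120.0]
candidates = {
    "model_a": [101.0, 109.0, 121.0],
    "model_b": [90.0, 100.0, 110.0],
}
scores = {name: error_rates(actual, f) for name, f in candidates.items()}
front = pareto_front(scores)
```

Here model_a has a smaller error than model_b on every measure, so only model_a remains in the front; when candidates trade off one error rate against another, several can remain.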

The results of the first stage show the best cluster centre optimization for data clustering, with a low compactness and separation value and high intensity. Three cluster centres were selected because these centres have a geometry that is suitable for the highest data reduction of 27%. The best optimized interval is used for imputing missing data. The results of the second stage show the drawbacks of time series forecasting using a MOGA based on the DR model. The MOGA based on the ARIMA model yields better forecasting results. The results of the final stage show the drawbacks of the MOGA based on a risk-based LCC model regarding its estimation. However, the risk-based LCC model offers the possibility of optimizing the replacement schedule.

Overall, MOGA is highly promising for optimization compared with the other methods investigated in this thesis.

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2019. p. 141
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
Keywords
Artificial intelligence (AI), Life cycle cost (LCC), Machine learning (ML), Multi-objective genetic algorithm (MOGA), Risk-based life cycle cost (LCC), Tunnel fans, Two-level system
National Category
Computer Sciences; Reliability and Maintenance; Other Civil Engineering
Research subject
Operation and Maintenance Engineering
Identifiers
urn:nbn:se:ltu:diva-76172 (URN); 978-91-7790-454-0 (ISBN); 978-91-7790-455-7 (ISBN)
Public defence
2019-12-06, F1031, Lulea, Porsön, Luleå, 10:00 (English)
Available from: 2019-09-30. Created: 2019-09-30. Last updated: 2024-03-28. Bibliographically approved

By author/editor
Al-Douri, Yamur K.; Hamodi, Hussan; Zhang, Liangwei