A hybrid fault detection and diagnosis method in server rooms’ cooling systems
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science. ORCID iD: 0000-0001-8185-7118
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science. ORCID iD: 0000-0003-0075-1608
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science. ORCID iD: 0000-0002-0799-2888
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Computer Science; ABB Corporate Research, Västerås, Sweden. ORCID iD: 0000-0003-2353-0752
2019 (English). In: 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), IEEE, 2019, p. 1405-1410. Conference paper, Published paper (Other academic)
Abstract [en]

Data centers, like all complex systems, are prone to faults, and the cost of those faults can be very high. This paper focuses on detecting faults in cooling systems, in particular at the level of local fans. A hybrid approach is proposed in which a model substitutes for the real system to generate a dataset containing records of both normal and fault cases. On the generated data, a machine learning algorithm or an ensemble of algorithms is selected and trained to detect the faults. To demonstrate the approach, a rack model of a real data center is created, and the reliability of the model is shown. Using the model, a dataset with both normal and abnormal records is generated. To detect faults of local fans, simple classifiers are built for every pair of a local fan and a processor unit. The classifiers are trained on one part of the generated data (training data), and their accuracy is then estimated on another part (test data). A real-time fault detection system is built on top of the classifiers. The rack model is used as a substitute for the real plant to check the operability of the system.
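
The workflow described in the abstract (generate labelled data from a model, train one simple classifier per fan and processor pair, estimate accuracy on held-out data) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy thermal formula standing in for the rack model, the 30 °C ambient temperature, the fault that reduces cooling to 40 %, the single-threshold classifier, and all function and pair names.

```python
import random

AMBIENT = 30.0  # assumed inlet temperature in deg C (illustrative only)


def simulate_record(rng, fan_faulty):
    """Toy stand-in for the rack model: one (cpu_load, cpu_temp) sample."""
    load = rng.uniform(0.2, 1.0)            # CPU utilisation in [0.2, 1.0]
    cooling = 0.4 if fan_faulty else 1.0     # assumed airflow loss on a fan fault
    temp = AMBIENT + 40.0 * load / cooling + rng.gauss(0.0, 1.0)
    return load, temp


def generate_dataset(n, fault_ratio, seed):
    """Generate n labelled records; the label is True for a fan fault."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        faulty = rng.random() < fault_ratio
        load, temp = simulate_record(rng, faulty)
        data.append((load, temp, faulty))
    return data


def feature(load, temp):
    """Temperature rise per unit of load; larger when cooling is degraded."""
    return (temp - AMBIENT) / load


def accuracy(data, threshold):
    """Fraction of records where 'feature > threshold' matches the label."""
    correct = sum((feature(l, t) > threshold) == f for l, t, f in data)
    return correct / len(data)


def train_threshold(train):
    """Pick the candidate threshold with the best training accuracy."""
    best_thr, best_acc = None, -1.0
    for thr in sorted(feature(l, t) for l, t, _ in train):
        acc = accuracy(train, thr)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr


def build_pair_classifiers(pairs, n_records=600, seed=0):
    """Train and test one threshold classifier per (fan, cpu) pair."""
    results = {}
    for i, pair in enumerate(pairs):
        data = generate_dataset(n_records, fault_ratio=0.3, seed=seed + i)
        split = int(0.7 * len(data))         # 70 % training, 30 % test
        train, test = data[:split], data[split:]
        thr = train_threshold(train)
        results[pair] = (thr, accuracy(test, thr))
    return results


if __name__ == "__main__":
    pairs = [("fan1", "cpu1"), ("fan2", "cpu2")]  # hypothetical pair names
    for pair, (thr, acc) in build_pair_classifiers(pairs).items():
        print(pair, "threshold:", round(thr, 1), "test accuracy:", round(acc, 3))
```

In a real-time setting, the trained thresholds would be applied to each incoming (load, temperature) sample for the corresponding pair. The paper's actual model, features and classifiers may differ from this sketch.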

Place, publisher, year, edition, pages
IEEE, 2019. p. 1405-1410
Series
IEEE International Conference on Industrial Informatics (INDIN), ISSN 1935-4576, E-ISSN 2378-363X
Keywords [en]
data center, cooling system, fault detection, classification
National Category
Computer Sciences
Research subject
Dependable Communication and Computation Systems
Identifiers
URN: urn:nbn:se:ltu:diva-75445
DOI: 10.1109/INDIN41052.2019.8971959
ISI: 000529510400210
Scopus ID: 2-s2.0-85079044437
OAI: oai:DiVA.org:ltu-75445
DiVA, id: diva2:1341389
Conference
2019 IEEE 17th International Conference on Industrial Informatics (INDIN), 22-25 July, 2019, Helsinki-Espoo, Finland
Note

ISBN for host publication: 978-1-7281-2927-3, 978-1-7281-2928-0

Available from: 2019-08-08 Created: 2019-08-08 Last updated: 2022-08-30 Bibliographically approved
In thesis
1. Simulation-based development of distributed control systems in energy-efficient data centres
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The main focus of this thesis is integrated automated control systems in modern data centres. Data centres are mission-critical facilities, since they provide services for transporting, storing and processing vast amounts of data, which can be considered the "new oil" of the Industry 4.0 era. Reliability is crucial for keeping data centres available to customers; they therefore require detecting and predicting faults and either recovering from them in time or mitigating their effects. Sustainability of data centres means reducing energy consumption and mitigating the negative impact on the environment. Data centres therefore require flexible management of IT and cooling workloads to save energy, and they are oriented towards renewable energy generation and free cooling methods. Thus, integrated automated control in modern data centres is expected to achieve sustainability and energy efficiency while maintaining reliability and availability. The thesis addresses the reliability and sustainability issues in modern data centres. Handling such issues requires developing and validating control strategies, as well as constructing comprehensive control and automation systems based on these strategies. Modern data centres have a modular architecture that provides clear and unified procedures for installing and replacing data centre components. Because of this modular structure, it is unreasonable for their control systems to remain centralised, static and rigid. The thesis therefore focuses on developing modular and flexible automation systems for data centres. Modular and flexible control assumes that controllers make their decisions autonomously based on their own objectives and interact with each other to achieve common goals for the control system as a whole.
The thesis's first contribution is the proposal of multi-agent control (MAC), a distributed approach that implements the required control functions through communication and interaction between controllers. The work suggests a general design of the multi-agent control, which focuses on base agents acting as individual controllers and on the interactions between the agents. Engineering an automation system requires progressive and continuous validation. The closed-loop approach, which allows the control system to be validated, uses a plant model as an essential part. The second contribution is a modular toolbox that enables building models of data centres of any scale and configuration with relative ease. The toolbox comprises Simulink blocks that model individual components of a regular data centre. Each block is a complete model of the corresponding component, encapsulating all parameters and equations describing its behaviour. The toolbox is extendable by adding new modifications of the existing blocks as well as by developing new blocks. The constructed model can thus substitute for the real data centre when examining the performance of different control strategies in a dynamic mode. The third contribution, in addition to the modelling toolbox, is a control toolbox: a set of Simulink blocks implementing individual controllers that utilise reinforcement learning algorithms. The control toolbox makes it possible to examine different reinforcement learning algorithms and reward functions and to select the most relevant ones for particular controllers. The main outcome of the thesis is thus a collection of methods, algorithms and models that enable the creation of a platform supporting the development and validation of distributed automated control systems for data centres.
The platform is a modular toolbox aimed at constructing data centre models and developing the data centre control system as a set of interacting autonomous agents. The platform also uses the multi-agent approach as a promising way of organising the agents' interactions, both with traditional methods, such as a voting procedure or an auction, and with multi-agent reinforcement learning.
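
As a minimal illustration of the voting procedure mentioned above, the Python sketch below lets a set of agents each propose an action and selects the most frequently proposed one. The agent names, the action labels and the alphabetical tie-break rule are assumptions made for this example; the thesis does not specify them here.

```python
from collections import Counter


def plurality_vote(proposals):
    """Return the action proposed by the most agents.

    proposals maps an agent name to the action it proposes.
    Ties are broken alphabetically (an assumption of this sketch).
    """
    if not proposals:
        raise ValueError("at least one proposal is required")
    counts = Counter(proposals.values())
    top = max(counts.values())
    return min(action for action, n in counts.items() if n == top)


if __name__ == "__main__":
    # Hypothetical cooling agents voting on a shared fan-speed decision.
    proposals = {
        "rack_agent_1": "increase",
        "rack_agent_2": "hold",
        "rack_agent_3": "increase",
    }
    print(plurality_vote(proposals))
```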

Place, publisher, year, edition, pages
Luleå: Luleå University of Technology, 2022
Series
Doctoral thesis / Luleå University of Technology 1 jan 1997 → …, ISSN 1402-1544
National Category
Computer Sciences
Research subject
Dependable Communication and Computation Systems
Identifiers
urn:nbn:se:ltu:diva-92717 (URN), 978-91-8048-133-5 (ISBN), 978-91-8048-134-2 (ISBN)
Public defence
2022-10-18, E238, Luleå, 10:00 (English)
Available from: 2022-08-31 Created: 2022-08-30 Last updated: 2022-09-23 Bibliographically approved

Authority records

Berezovskaya, Yulia; Yang, Chen-Wei; Mousavi, Arash; Zhang, Xiaojing; Vyatkin, Valeriy