Advanced Data Analytics Modelling for Air Quality Assessment
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering.
2023 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Air quality assessment plays a crucial role in understanding the impact of air pollution on human health and the environment. With the increasing demand for accurate assessment and prediction of air quality, advanced data analytics modelling techniques offer promising solutions. This thesis focuses on leveraging advanced data analytics to assess and analyse air pollution concentration levels in Italy at a 4 km resolution using the FORAIR_IT dataset simulated at ENEA on the CRESCO6 infrastructure, aiming to uncover valuable insights and identify the most appropriate AI models for predicting air pollution levels. The data collection, understanding, and pre-processing procedures are discussed, followed by the application of big data training and forecasting using Apache Spark MLlib. The research also encompasses different phases, including descriptive and inferential analysis to understand the air pollution concentration dataset, hypothesis testing to examine the relationship between various pollutants, machine learning prediction using several regression models and an ensemble machine learning approach, and time series analysis on the entire dataset as well as three major regions in Italy (Northern Italy – Lombardy, Central Italy – Lazio and Southern Italy – Campania). The computation time for these regression models is also evaluated, and a comparative analysis is done on the results obtained. The evaluation process and the experimental setup involve the usage of the ENEAGRID/CRESCO6 HPC Infrastructure and Apache Spark. This research has provided valuable insights into understanding air pollution patterns and improving prediction accuracy. The findings of this study have the potential to drive positive change in environmental management and decision-making processes, ultimately leading to healthier and more sustainable communities.
As we continue to explore the vast possibilities offered by advanced data analytics, this research serves as a foundation for future advancements in air quality assessment in Italy, and the models are transferable to other regions and provinces in Italy, paving the way for a cleaner and greener future.

Place, publisher, year, edition, pages
2023. p. 156
Keywords [en]
Air quality assessment, Advanced Data Analytics, Artificial Intelligence (AI), Machine Learning (ML), Big Data, Regression Models, Time Series Models, High Performance Computing (HPC), Air Pollution
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:ltu:diva-101490
OAI: oai:DiVA.org:ltu-101490
DiVA, id: diva2:1801320
External cooperation
ENEA Casaccia Research Center, Italy; Leeds Beckett University, United Kingdom
Subject / course
Student thesis, at least 30 credits
Educational program
Master Programme in Green Networking and Cloud Computing
Presentation
2023-06-14, Municipality City Hall, Anacapri, Italy, Anacapri, 15:00 (English)
Supervisors
Examiners
Available from: 2023-10-06 Created: 2023-09-29 Last updated: 2023-10-09 Bibliographically approved

Open Access in DiVA

No full text in DiVA