A comparison of ordinary least squares and robust regression for predicting ozone concentration level

Muhamad, Muqhlisah (2017) A comparison of ordinary least squares and robust regression for predicting ozone concentration level. Masters thesis, Universiti Teknologi MARA (UiTM).

Abstract

Air pollution has become a serious issue in Malaysia, driven in part by technological and economic development. It has negative effects on human health and crop yield, and it harms ecosystems. Ground-level ozone (O₃) is one of the air pollutants of concern in the country, and gaseous O₃ can cause serious problems for the respiratory system. The multiple linear regression model has been widely used to predict the level of O₃ concentration. However, the traditional ordinary least squares (OLS) method is sensitive to influential outliers in air pollution data, which can lead to biased predictions. In this study, influential outliers were identified using standardized residuals, Cook’s distance and leverage values. To overcome this problem, the study used robust regression to reduce the contamination from influential outliers in the y-direction, in the x-space, and in both the y-direction and x-space, using M-estimation, S-estimation and MM-estimation respectively, in prediction models of O₃ concentration for the next day (D+1), the next two days (D+2) and the next three days (D+3). Two monitoring stations were selected: one in an urban area (Shah Alam) and one in an industrial area (Pasir Gudang). The study uses daytime concentration data from 7am to 7pm (12-hour average concentration), since O₃ is most active from morning until evening. The results showed that the robust regression method performed better than the OLS method when evaluated with four performance indicators: normalized absolute error (NAE), root mean square error (RMSE), index of agreement (IA) and prediction accuracy (PA). The average accuracy (IA and PA) of the robust regression method improved by 0.11% for the D+1 model and 0.07% for the D+2 model. Meanwhile, the average error (NAE and RMSE) decreased by 0.23%, 0.19% and 0.02% for the D+1, D+2 and D+3 models respectively. The study shows that robust regression is a useful complement to the OLS method when the data contain influential outliers. It also suggests that robust regression be more widely adopted in air pollution studies, for example by researchers and the Department of Environment.
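The comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code or data. It assumes synthetic data with outliers in the y-direction, a Huber M-estimator fitted by iteratively reweighted least squares (the abstract names M-estimation but not the weight function), and commonly used definitions of RMSE, NAE and IA (the abstract does not define the indicators, and some authors divide by n − 1 in RMSE). All function and variable names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares via NumPy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_huber_m(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimator by iteratively reweighted least squares.

    Assumption: Huber weights with tuning constant c = 1.345 and a
    MAD-based residual scale; the thesis may use different choices.
    """
    beta = fit_ols(X, y)  # start from the OLS solution
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale estimate: normalized median absolute deviation.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, c/|u| for large ones.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        new_beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

def rmse(obs, pred):
    """Root mean square error (1/n form; assumed definition)."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def nae(obs, pred):
    """Normalized absolute error: sum|P - O| / sum(O) (assumed definition)."""
    return float(np.sum(np.abs(pred - obs)) / np.sum(obs))

def index_of_agreement(obs, pred):
    """Willmott-style index of agreement (assumed definition)."""
    obar = obs.mean()
    denom = np.sum((np.abs(pred - obar) + np.abs(obs - obar)) ** 2)
    return float(1.0 - np.sum((pred - obs) ** 2) / denom)

# Synthetic example (not ozone data): a linear trend with Gaussian noise.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])      # intercept + one predictor
y_clean = 20 + 3 * x + rng.normal(0, 1, n)
y = y_clean.copy()
y[:20] += 40                               # 10% outliers in the y-direction

beta_ols = fit_ols(X, y)
beta_huber = fit_huber_m(X, y)

# Score both fits against the uncontaminated targets.
rmse_ols = rmse(y_clean, X @ beta_ols)
rmse_huber = rmse(y_clean, X @ beta_huber)
print("OLS   coefficients:", beta_ols, "RMSE:", rmse_ols)
print("Huber coefficients:", beta_huber, "RMSE:", rmse_huber)
print("NAE (Huber):", nae(y_clean, X @ beta_huber))
print("IA  (Huber):", index_of_agreement(y_clean, X @ beta_huber))
```

In this sketch the outliers pull the OLS intercept upward, while the Huber weights shrink their influence. Huber M-estimation only guards against outliers in the y-direction, which is consistent with the abstract's use of S-estimation and MM-estimation for contamination in the x-space.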

Metadata

Item Type: Thesis (Masters)
Creators: Muhamad, Muqhlisah (Email / ID Num.: 2014237204)
Contributors: Advisor: Mohamad Japeri, Ahmad Zia Ul-Saufie (Email / ID Num.: UNSPECIFIED)
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > Mathematical statistics. Probabilities > Prediction analysis
T Technology > TD Environmental technology. Sanitary engineering > Environmental pollution
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Master of Science
Keywords: Formation of ozone, Air pollution, Meteorological variables
Date: 2017
URI: https://ir.uitm.edu.my/id/eprint/120572

Download

Text: 120572.pdf (352kB)

ID Number

120572
