The enhancement of normal ratio method through multiple imputation approach in estimating missing data with outliers for Peninsular Malaysian rainfall dataset / Siti Nur Zahrah Amin Burhanuddin

Amin Burhanuddin, Siti Nur Zahrah (2020) The enhancement of normal ratio method through multiple imputation approach in estimating missing data with outliers for Peninsular Malaysian rainfall dataset / Siti Nur Zahrah Amin Burhanuddin. PhD thesis, Universiti Teknologi MARA.

Abstract

A complete rainfall dataset is essential for representing climatological characteristics precisely, especially in hydrological and meteorological studies. It also contributes to effective and efficient environmental management. However, rainfall data are highly prone to missing values because of the dynamic nature of the climatic variable. Furthermore, the data are subject to seasonal activity that can introduce uncertainty and irregular variation in rainfall amounts, which in turn produces outliers in the dataset. These problems degrade the quality of the rainfall dataset and consequently give users inaccurate information. This study therefore attempts to develop a practical and reliable approach to treating missing values, in an effort to provide a good-quality dataset for the public domain. A spatial estimation method, the normal ratio method, was used to estimate the missing rainfall data. Various improvements to the method have been proposed; however, little work has been done on making it robust so that it performs well on datasets that contain outliers. Therefore, this study proposes enhanced normal ratio methods for imputing missing values in daily rainfall datasets with outliers. Robust statistics (trimmed mean, median, and geometric median) were adopted in the proposed methods to make them less sensitive to outliers. The normal ratio method is commonly implemented through a single imputation approach, which has the limitation of not accounting for uncertainty in the missing values. Thus, this study proposes a multiple imputation approach based on the block bootstrap to overcome this limitation and to improve on the existing multiple imputation approach implemented in the Amelia package.
The block bootstrap is introduced for the first time in the proposed multiple imputation approach (named NRMI-Bboot) to improve performance on rainfall time series. The performance of each estimation method was evaluated on five performance criteria at six levels of missing data (5%, 10%, 15%, 20%, 25%, and 30%) and three levels of outlying data (5%, 10%, and 15%) created in the dataset. Complete 40-year daily rainfall records from 22 meteorological stations were used in the analysis. Four target stations were selected to represent the main regions of Peninsular Malaysia (northwest, east, west, and southwest). The capability of the estimation methods was further verified using distribution fitting. Adopting robust statistics in the proposed estimation methods, combined with the NRMI-Bboot approach, improved the estimation results, especially for datasets containing extreme outliers. The block bootstrap ensured that the original rainfall time series structure was preserved within each monsoon block and consequently produced more accurate estimation results. These results indicate the advantages of the proposed estimation methods and multiple imputation approach in providing accurate imputed values for missing data in the Peninsular Malaysian daily rainfall dataset.
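
The abstract gives no formulas, so the following is only a minimal sketch of the ideas it names, not the thesis's NRMI-Bboot procedure. It assumes the classic normal ratio estimator (the average of each neighbouring station's value scaled by the ratio of target to neighbour long-term rainfall), swaps the mean for a median as one possible robust variant, and uses a generic fixed-length moving block bootstrap rather than the monsoon-based blocks mentioned above. All function names, parameters, and default values are hypothetical.

```python
import random
from statistics import mean, median


def normal_ratio_estimate(target_normal, neighbour_normals, neighbour_values,
                          combine=mean):
    """Estimate a missing value at the target station.

    Classic normal ratio method: combine (N_target / N_i) * P_i over the
    neighbouring stations, where N is a station's long-term (normal) rainfall.
    combine=mean gives the classic estimator; combine=median is an
    illustrative robust variant (an assumption, not the thesis's formula).
    """
    scaled = [
        (target_normal / n_i) * p_i
        for n_i, p_i in zip(neighbour_normals, neighbour_values)
        if n_i > 0  # skip neighbours whose normal is zero (cannot be scaled)
    ]
    return combine(scaled)


def moving_block_indices(n, block_len, rng):
    """Return n time indices built from randomly placed contiguous blocks."""
    indices = []
    while len(indices) < n:
        start = rng.randrange(n - block_len + 1)
        indices.extend(range(start, start + block_len))
    return indices[:n]  # trim the last block so the length is exactly n


def bootstrap_imputations(target_obs, neighbour_obs, neighbour_today,
                          n_draws=5, block_len=3, seed=0, combine=mean):
    """Return n_draws imputed values, one per block-bootstrap resample.

    Each resample re-estimates the station normals from blocks of consecutive
    days, so the spread of the draws reflects uncertainty in the imputation.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        idx = moving_block_indices(len(target_obs), block_len, rng)
        target_normal = mean([target_obs[i] for i in idx])
        neighbour_normals = [mean([s[i] for i in idx]) for s in neighbour_obs]
        draws.append(normal_ratio_estimate(
            target_normal, neighbour_normals, neighbour_today, combine))
    return draws
```

As a usage example, with a target normal of 100, three neighbours whose normals are all 100, and neighbour readings of 10, 12, and 200, `normal_ratio_estimate` returns 74.0 with `combine=mean` and 12.0 with `combine=median`, which illustrates how a single outlying neighbour dominates the classic estimate. The values returned by `bootstrap_imputations` could then be pooled, for instance by averaging, to form a final imputed value.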

Metadata

Item Type: Thesis (PhD)
Creators: Amin Burhanuddin, Siti Nur Zahrah (ID Num. 2012678162)
Contributors: Thesis advisor: Mohd Deni, Sayang (Assoc. Prof. Dr.)
Subjects: Q Science > QA Mathematics > Mathematical statistics. Probabilities > Data processing
Q Science > QC Physics > Meteorology. Climatology. Including the earth's atmosphere > Rain and rainfall
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Doctor of Philosophy in Information Technology and Quantitative Sciences
Keywords: Rainfall data measurement; data quality
Date: October 2020
URI: https://ir.uitm.edu.my/id/eprint/60924

Download

Text: 60924.pdf (43kB)

ID Number

60924
