Examining the impact of feature selection techniques on machine and deep learning models for the prediction of COVID-19 / Hafiza Zoya Mojahid ... [et al.]

Mojahid, Hafiza Zoya and Mohamad Zain, Jasni and Yusoff, Marina and Basit, Abdul and Jumaat, Abdul Kadir and Ali, Mushtaq (2025) Examining the impact of feature selection techniques on machine and deep learning models for the prediction of COVID-19 / Hafiza Zoya Mojahid ... [et al.]. Malaysian Journal of Computing (MJoC), 10 (1): 12. pp. 2135-2158. ISSN 2600-8238

Abstract

Feature selection is a vital preprocessing step for identifying the most informative features in complex datasets, enhancing the efficiency and accuracy of machine learning models. Its applications extend across various domains, including big data analytics, finance, chemometrics, medical diagnostics, biological research, intrusion detection systems, and renewable energy solutions. In medical contexts, feature selection serves a dual purpose: it reduces dimensionality while simultaneously improving the comprehension of disease etiology. This study examines three key variable selection methods: Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and Least Absolute Shrinkage and Selection Operator (LASSO). We evaluate the interaction of these methods with Support Vector Machines (SVM), Logistic Regression (LR), and eXtreme Gradient Boosting (XGBoost) for COVID-19 prediction. The key performance metrics are F1-score, precision, recall, and accuracy. LASSO with SVM performed best overall, with an accuracy of 0.7679 and a precision of 0.8236, whereas PCA outperformed RFE when paired with XGBoost, underscoring the importance of matching feature selection methods to model types. In addition, we employ a deep learning Feature Selection method based on Extreme Learning Machine (FSELM) and compare its effectiveness against the established feature selection techniques. Our work reveals that Lactate Dehydrogenase (LDH) is the most relevant feature for predicting COVID-19. This research aims to provide insights into the optimal integration of feature selection techniques with advanced machine learning models for accurate prediction of COVID-19.
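
As an illustration of the four metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1-score for a binary classifier from its predictions. It is not the authors' evaluation code: the function name `classification_metrics` and the toy labels are hypothetical, and the labels are unrelated to the study's dataset or reported scores.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` treated as the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels (1 = positive case, 0 = negative case).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting precision and recall alongside accuracy matters in a diagnostic setting, because a model can reach high accuracy on imbalanced data while still missing many positive cases.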

Metadata

Item Type: Article
Creators (Name: Email / ID Num.):
Mojahid, Hafiza Zoya: 2022547787@student.uitm.edu.my
Mohamad Zain, Jasni: jasni@.uitm.edu.my
Yusoff, Marina: marina998@uitm.edu.my
Basit, Abdul: 2021691374@student.uitm.edu.my
Jumaat, Abdul Kadir: abdulkadir@tmsk.uitm.edu.my
Ali, Mushtaq: mushtaq.ali@riphah.edu.pk
Subjects: Q Science > Q Science (General) > Machine learning
Divisions: Universiti Teknologi MARA, Shah Alam > College of Computing, Informatics and Mathematics
Journal or Publication Title: Malaysian Journal of Computing (MJoC)
UiTM Journal Collections: Listed > Malaysian Journal of Computing (MJoC)
ISSN: 2600-8238
Volume: 10
Number: 1
Page Range: pp. 2135-2158
Keywords: COVID-19, Deep Learning, Extreme Learning Machine, Feature Selection, Machine Learning Models, Prediction
Date: April 2025
URI: https://ir.uitm.edu.my/id/eprint/112922

ID Number: 112922
