Web scraping and regression analysis based on machine learning for COVID-19 with rapid software platform / Aizal Yusrina Idris, Razan Bamoallem and Mohamad Harith Azfar Mohamad Hatta

Idris, Aizal Yusrina and Bamoallem, Razan and Mohamad Hatta, Mohamad Harith Azfar (2022) Web scraping and regression analysis based on machine learning for COVID-19 with rapid software platform. Mathematical Sciences and Informatics Journal (MIJ), 3 (1). pp. 75-85. ISSN 2735-0703

Abstract

Since the onset of the global COVID-19 pandemic, experts from different domains, including scientists, clinicians, and healthcare professionals, have kept exploring technologies to manage COVID-19 data. Up-to-date and accurate data collection is critical for making effective and efficient decisions on any aspect of the emergency and its consequences. Although some of these experts are not trained data scientists, web data extraction and analysis are the key skills needed to obtain recent COVID-19 data. While a tremendous amount of literature is available in academic databases, it is difficult to find a report that presents the fundamental methods for implementing web data analysis in a simple way with a rapid software platform. This paper demonstrates a simple framework for web data extraction, or web scraping, whose output is then analyzed in a rapid software platform. The Python scripting language is the tool used for web scraping, while RapidMiner is the rapid software used for data visualization and analysis. Simple linear regression based on a machine learning approach was implemented in RapidMiner to predict COVID-19 deaths from the collected data. This paper will be useful for academicians and industry practitioners who want to conduct more robust data analysis and address more challenging issues, such as big data analytics, in any domain.
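The two steps named in the abstract, extracting a table from a web page and fitting a simple linear regression, can be sketched as follows. This is a minimal illustration only and not the authors' implementation: the regression is done here in plain Python instead of RapidMiner, the HTML fragment is a hypothetical stand-in for a real page, and its numbers are made up for the example (they are not COVID-19 statistics). The names `SAMPLE_HTML`, `TableParser`, `scrape_rows`, and `fit_simple_linear_regression` are assumptions introduced for this sketch.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the figures are invented for illustration only.
SAMPLE_HTML = """
<table id="cases">
  <tr><th>Region</th><th>Confirmed</th><th>Deaths</th></tr>
  <tr><td>A</td><td>1,000</td><td>20</td></tr>
  <tr><td>B</td><td>2,000</td><td>40</td></tr>
  <tr><td>C</td><td>3,000</td><td>60</td></tr>
  <tr><td>D</td><td>4,000</td><td>80</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty and are skipped.
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_rows(html):
    """Return (confirmed, deaths) integer pairs parsed from the table."""
    parser = TableParser()
    parser.feed(html)
    return [
        (int(confirmed.replace(",", "")), int(deaths.replace(",", "")))
        for _region, confirmed, deaths in parser.rows
    ]


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


pairs = scrape_rows(SAMPLE_HTML)
xs = [confirmed for confirmed, _ in pairs]
ys = [deaths for _, deaths in pairs]
slope, intercept = fit_simple_linear_regression(xs, ys)

# Predict deaths for a hypothetical region with 5,000 confirmed cases.
predicted = slope * 5000 + intercept
print(f"slope={slope:.4f}, intercept={intercept:.4f}, predicted={predicted:.1f}")
```

For a live page, the HTML string would come from an HTTP request rather than an inline constant, and the column layout would have to be checked against the actual source before parsing.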

Metadata

Item Type: Article
Creators (Email / ID Num.):
Idris, Aizal Yusrina (idrisa@rcyci.edu.sa)
Bamoallem, Razan (bamoallemr@rcyci.edu.sa)
Mohamad Hatta, Mohamad Harith Azfar (2021782731@student.uitm.edu.my)
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > Multivariate analysis. Cluster analysis. Longitudinal method
Q Science > QA Mathematics > Multivariate analysis. Cluster analysis. Longitudinal method > Regression analysis. Correlation analysis. Spatial analysis (Statistics)
Divisions: Universiti Teknologi MARA, Perak > Tapah Campus > Faculty of Computer and Mathematical Sciences
Journal or Publication Title: Mathematical Sciences and Informatics Journal (MIJ)
UiTM Journal Collections: UiTM Journal > Mathematical Science and Information Journal (MIJ)
ISSN: 2735-0703
Volume: 3
Number: 1
Page Range: pp. 75-85
Keywords: Web scraping; Regression analysis; Machine learning; COVID-19; Python; RapidMiner
Date: May 2022
URI: https://ir.uitm.edu.my/id/eprint/61730

Download

Text: 61730.pdf (981kB)

ID Number

61730
