Addressing class and demographic imbalance in e-commerce behavior prediction: a case study using resampling techniques

Mustakim, Nurul Ain and Abdul Aziz, Maslina and Abdul Rahman, Shuzlina and Rahmiati, Rahmiati (2026) Addressing class and demographic imbalance in e-commerce behavior prediction: a case study using resampling techniques. Malaysian Journal of Computing (MJoC), 11 (1): 1. pp. 2339-2347. ISSN 2600-8238

Identification Number (DOI): 10.24191/mjoc.vo11i1.8077

Abstract

In e-commerce predictive modeling, imbalanced data remains a critical challenge, particularly when both class labels and demographic attributes are unequally distributed. This study investigates an approach that combines the Synthetic Minority Oversampling Technique (SMOTE) with demographic resampling to improve the performance of models predicting online purchasing behavior in Malaysia. Using a dataset of 1,126 survey responses, six classifiers (J48, Random Tree, REPTree, JRip, PART, and OneR) were evaluated under three conditions: unbalanced, after SMOTE, and after SMOTE with demographic balancing. The results showed clear improvements in model performance. For example, J48’s accuracy increased from 62.85% (unbalanced) to 98.69% (fully balanced), while Random Tree achieved 99.29%. These results highlight the effectiveness of integrating class and demographic balancing, an approach seldom explored in e-commerce analytics. This study contributes by demonstrating how addressing both types of imbalance yields more reliable predictive models, offering practical insights for consumer segmentation, targeting, and personalization. Future work could extend this approach by balancing additional attributes and applying it to ensemble or deep learning models for improved robustness and interpretability.
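The two balancing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how demographic resampling was carried out, so random oversampling of the smaller groups is assumed here, and the function names, toy feature values, and group labels are all hypothetical.

```python
import random
from collections import Counter


def smote_oversample(minority, n_new, k=2, seed=0):
    """SMOTE-style sketch: build n_new synthetic points, each interpolated
    between a random minority point and one of its k nearest minority
    neighbours. Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Rank the other minority points by squared Euclidean distance.
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def balance_groups(rows, group_of, seed=0):
    """Assumed demographic step: randomly duplicate rows (with replacement)
    until every group is as large as the largest group."""
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(group_of(row), []).append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        balanced.extend(
            rng.choice(members) for _ in range(target - len(members))
        )
    return balanced


# Toy data only; none of these values come from the study.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.4, 0.4), (0.5, 0.1)]
minority = [(2.0, 2.0), (2.2, 1.9), (1.8, 2.1)]

# Step 1: class balancing with synthetic minority points.
synthetic = smote_oversample(minority, n_new=len(majority) - len(minority))
class_counts = Counter(
    ["no"] * len(majority) + ["yes"] * (len(minority) + len(synthetic))
)

# Step 2: demographic balancing on a hypothetical group attribute.
respondents = [("F", 1), ("F", 2), ("F", 3), ("M", 4)]
balanced = balance_groups(respondents, group_of=lambda row: row[0])
group_counts = Counter(row[0] for row in balanced)

print(class_counts, group_counts)
```

In this sketch the class step creates new feature vectors by interpolation, while the assumed demographic step only duplicates existing rows. Duplicated or synthetic rows that end up in both training and test folds can inflate accuracy estimates, so resampling is usually applied to the training portion only.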

Metadata

Item Type: Article
Creators (Email / ID Num.):
Mustakim, Nurul Ain (ainmustakim@melaka.uitm.edu.my)
Abdul Aziz, Maslina (maslina_aziz@uitm.edu.my)
Abdul Rahman, Shuzlina (shuzlina@uitm.edu.my)
Rahmiati, Rahmiati (rahmiati@fe.unp.ac.id)
Subjects: H Social Sciences > HF Commerce
H Social Sciences > HF Commerce > Electronic commerce
Divisions: Universiti Teknologi MARA, Shah Alam > College of Computing, Informatics and Mathematics
Journal or Publication Title: Malaysian Journal of Computing (MJoC)
UiTM Journal Collections: UiTM Journals > Malaysian Journal of Computing (MJoC)
ISSN: 2600-8238
Volume: 11
Number: 1
Page Range: pp. 2339-2347
Keywords: Classification, Consumer behavior, Data imbalance, Demographic resampling, E-Commerce, SMOTE
Date: April 2026
URI: https://ir.uitm.edu.my/id/eprint/136297


ID Number

136297
