HYB-ABSAFAKE: a hybrid approach integrating implicit aspect-based sentiment analysis with imbalanced dataset handling for enhancing fake review detection

Abdul Rahim, Leena Ardini (2025) HYB-ABSAFAKE: a hybrid approach integrating implicit aspect-based sentiment analysis with imbalanced dataset handling for enhancing fake review detection. Masters thesis, UiTM.

Abstract

Online shopping has gained popularity due to its convenience and vast product selection, leading consumers to share reviews online. However, the rise of fake reviews undermines consumer trust and misleads potential buyers. Existing detection models often analyze full review text but fail to capture subtle cues such as exaggerated sentiment, duplicated reviews, or lack of specificity. Moreover, the significant class imbalance, where genuine reviews far outnumber fake ones, leads to biased models that struggle to detect deceptive content effectively. To address these limitations, this research introduces a novel hybrid approach, named HYB-ABSAFAKE, that integrates Bidirectional Encoder Representations from Transformers (BERT) for Implicit Aspect-Based Sentiment Analysis (ABSA), rule-based indicators of fake reviews, and the Synthetic Minority Over-sampling Technique (SMOTE) to handle imbalanced data. A Support Vector Machine (SVM) is used for classification, and the model is evaluated using k-fold cross-validation. The dataset, obtained from Kaggle's Amazon Reviews Dataset, comprises 2,582 reviews from four categories: foods, home care, personal care, and refreshments. This is the first study to combine implicit ABSA, rule-based indicators, and SMOTE for fake review detection. The hybrid approach is compared with two alternative approaches: an SVM baseline model and a BERT + rule-based + SVM model without SMOTE. It achieved 96% accuracy, 60% precision, 100% recall, and a 75% F1 score, demonstrating improved detection performance, particularly in identifying all fake reviews. The study contributes to fake review research by highlighting how implicit aspects and class imbalance influence deceptive patterns, thereby expanding the theoretical understanding of subtle linguistic cues. Methodologically, it presents a novel sequencing strategy and layered detection process that enhance both model interpretability and the detection of rare cases.
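The reported precision and recall can be cross-checked against the reported F1 score, since F1 is the harmonic mean of the two. The following is a minimal sketch of that arithmetic only; the function and variable names are illustrative and are not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the hybrid approach.
reported_precision = 0.60
reported_recall = 1.00

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")
```

With 60% precision and 100% recall, the harmonic mean works out to 75%, which agrees with the reported F1 score.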
Practically, the developed hybrid approach is scalable, explainable, and adaptable to real-world review platforms, supporting greater trust and more informed decision-making. Future work will explore larger annotated datasets, advanced resampling techniques such as Edited Nearest Neighbors, and adaptation of the hybrid approach to domain-specific areas like healthcare or education to improve practical relevance.
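The oversampling idea behind SMOTE, as named in the abstract, can be sketched in a few lines: each synthetic minority sample is placed at a random point on the segment between an existing minority sample and one of its nearest minority-class neighbours. The code below is a simplified illustration using only the Python standard library; it is not the thesis implementation, and the function name `smote`, its parameters, and the two-dimensional feature vectors are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors standing in for the minority (fake) class.
fake = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.3], [0.7, 0.15]]
new_points = smote(fake, n_new=6, k=3)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is an interpolation between two real minority samples, it stays within the range those samples span. When oversampling is combined with k-fold cross-validation, a common practice is to resample only the training folds so that synthetic points do not leak into the evaluation data; whether the thesis follows this ordering is not stated in the abstract.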

Metadata

Item Type: Thesis (Masters)
Creators: Abdul Rahim, Leena Ardini (Email / ID Num.: UNSPECIFIED)
Contributors: Thesis advisor: Abu Samah, Khyrina Airin Fariza (Email / ID Num.: UNSPECIFIED)
Subjects: H Social Sciences > HF Commerce
H Social Sciences > HF Commerce > Marketing > Social aspects. Social marketing
Divisions: Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences
Programme: Master of Science (Computer Science)
Keywords: Bidirectional Encoder Representations from Transformers (BERT), Aspect-Based Sentiment Analysis (ABSA), Synthetic Minority Over-sampling Technique (SMOTE)
Date: 2025
URI: https://ir.uitm.edu.my/id/eprint/129183
ID Number: 129183
