Abstract
Online shopping has gained popularity due to its convenience and vast product selection, leading consumers to share reviews online. However, the rise of fake reviews undermines consumer trust and misleads potential buyers. Existing detection models often analyze the full review text but fail to capture subtle cues such as exaggerated sentiment, duplicated reviews, or lack of specificity. Moreover, significant class imbalance, where genuine reviews far outnumber fake ones, leads to biased models that struggle to detect deceptive content effectively. To address these limitations, this research introduces a novel hybrid approach, named HYB-ABSAFAKE, that integrates Bidirectional Encoder Representations from Transformers (BERT) for implicit Aspect-Based Sentiment Analysis (ABSA), rule-based indicators of fake reviews, and the Synthetic Minority Over-sampling Technique (SMOTE) to handle imbalanced data. A Support Vector Machine (SVM) is used for classification, and the model is evaluated using k-fold cross-validation. The dataset, obtained from Kaggle's Amazon Reviews Dataset, comprises 2,582 reviews across four categories: foods, home care, personal care, and refreshments. This is the first study to combine implicit ABSA, rule-based indicators, and SMOTE for fake review detection. The hybrid approach is compared with two other feature approaches: an SVM baseline model and a BERT + rule-based + SVM model without SMOTE. It achieved 96% accuracy, 60% precision, 100% recall, and a 75% F1 score, demonstrating improved detection performance, particularly in identifying all fake reviews. The study contributes to fake review research by highlighting how implicit aspects and class imbalance influence deceptive patterns, thereby expanding the theoretical understanding of subtle linguistic cues. Methodologically, it presents a novel sequencing strategy and layered detection process that enhance both model interpretability and the detection of rare cases.
Practically, the developed hybrid approach is scalable, explainable, and adaptable to real-world review platforms, supporting greater trust and more informed decision-making. Future work will explore larger annotated datasets, advanced resampling techniques such as Edited Nearest Neighbors, and adaptation of the hybrid approach to domain-specific areas like healthcare or education to improve practical relevance.
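The reported scores can be related to one another through the standard confusion-matrix definitions. The sketch below is a minimal illustration, not the thesis's evaluation code: the function name `classification_metrics` and the counts per 100 reviews (6 true positives, 4 false positives, 0 false negatives, 90 true negatives) are hypothetical values chosen only because they reproduce the percentages quoted in the abstract. They are not taken from the thesis's results.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    The "positive" class here is "fake review".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of reviews flagged as fake that really are fake.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly fake reviews that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# HYPOTHETICAL counts per 100 reviews, chosen to match the abstract's
# percentages; the thesis's actual confusion matrix is not given here.
scores = classification_metrics(tp=6, fp=4, fn=0, tn=90)

for name, value in scores.items():
    print(f"{name}: {value:.0%}")
```

Under these illustrative counts, the function yields 96% accuracy, 60% precision, 100% recall, and 75% F1. Read this way, a recall of 100% means no fake review is missed, while a precision of 60% means that about four in ten flagged reviews are genuine.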
Metadata
| Item Type: | Thesis (Masters) |
|---|---|
| Creators: | Abdul Rahim, Leena Ardini (Email / ID Num.: UNSPECIFIED) |
| Contributors: | Thesis advisor: Abu Samah, Khyrina Airin Fariza (Email / ID Num.: UNSPECIFIED) |
| Subjects: | H Social Sciences > HF Commerce; H Social Sciences > HF Commerce > Marketing > Social aspects. Social marketing |
| Divisions: | Universiti Teknologi MARA, Shah Alam > Faculty of Computer and Mathematical Sciences |
| Programme: | Master of Science (Computer Science) |
| Keywords: | Bidirectional Encoder Representations from Transformers (BERT), Aspect-Based Sentiment Analysis (ABSA), Synthetic Minority Over-sampling Technique (SMOTE) |
| Date: | 2025 |
| URI: | https://ir.uitm.edu.my/id/eprint/129183 |
